Dataset statistics
| Number of variables | 41 |
|---|---|
| Number of observations | 59400 |
| Missing cells | 46094 |
| Missing cells (%) | 1.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 19.0 MiB |
| Average record size in memory | 336.0 B |
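The missing-cell percentage can be recomputed from the other figures in this table. The following is a minimal sketch; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 41
n_observations = 59400
missing_cells = 46094

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```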
Variable types
| Numeric | 10 |
|---|---|
| DateTime | 1 |
| Categorical | 28 |
| Boolean | 2 |
Warnings
| recorded_by has constant value "GeoData Consultants Ltd" | Constant |
|---|---|
| funder has a high cardinality: 1897 distinct values | High cardinality |
| installer has a high cardinality: 2145 distinct values | High cardinality |
| wpt_name has a high cardinality: 37400 distinct values | High cardinality |
| subvillage has a high cardinality: 19287 distinct values | High cardinality |
| lga has a high cardinality: 125 distinct values | High cardinality |
| ward has a high cardinality: 2092 distinct values | High cardinality |
| scheme_name has a high cardinality: 2696 distinct values | High cardinality |
| public_meeting is highly correlated with recorded_by | High correlation |
| payment_type is highly correlated with recorded_by and 1 other fields | High correlation |
| recorded_by is highly correlated with public_meeting and 21 other fields | High correlation |
| quality_group is highly correlated with recorded_by and 1 other fields | High correlation |
| source_class is highly correlated with recorded_by and 2 other fields | High correlation |
| water_quality is highly correlated with recorded_by and 1 other fields | High correlation |
| management_group is highly correlated with recorded_by and 1 other fields | High correlation |
| waterpoint_type_group is highly correlated with recorded_by and 1 other fields | High correlation |
| source is highly correlated with recorded_by and 2 other fields | High correlation |
| permit is highly correlated with recorded_by | High correlation |
| extraction_type is highly correlated with recorded_by and 2 other fields | High correlation |
| basin is highly correlated with recorded_by | High correlation |
| quantity is highly correlated with recorded_by and 1 other fields | High correlation |
| scheme_management is highly correlated with recorded_by | High correlation |
| waterpoint_type is highly correlated with recorded_by and 1 other fields | High correlation |
| status_group is highly correlated with recorded_by | High correlation |
| extraction_type_group is highly correlated with recorded_by and 2 other fields | High correlation |
| payment is highly correlated with payment_type and 1 other fields | High correlation |
| management is highly correlated with recorded_by and 1 other fields | High correlation |
| region is highly correlated with recorded_by | High correlation |
| quantity_group is highly correlated with recorded_by and 1 other fields | High correlation |
| extraction_type_class is highly correlated with recorded_by and 2 other fields | High correlation |
| source_type is highly correlated with recorded_by and 2 other fields | High correlation |
| funder has 3635 (6.1%) missing values | Missing |
| installer has 3655 (6.2%) missing values | Missing |
| public_meeting has 3334 (5.6%) missing values | Missing |
| scheme_management has 3877 (6.5%) missing values | Missing |
| scheme_name has 28166 (47.4%) missing values | Missing |
| permit has 3056 (5.1%) missing values | Missing |
| amount_tsh is highly skewed (γ1 = 57.80779995) | Skewed |
| num_private is highly skewed (γ1 = 91.93374999) | Skewed |
| id is uniformly distributed | Uniform |
| id has unique values | Unique |
| amount_tsh has 41639 (70.1%) zeros | Zeros |
| gps_height has 20438 (34.4%) zeros | Zeros |
| longitude has 1812 (3.1%) zeros | Zeros |
| num_private has 58643 (98.7%) zeros | Zeros |
| population has 21381 (36.0%) zeros | Zeros |
| construction_year has 20709 (34.9%) zeros | Zeros |
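The zero percentages reported above are each zero count divided by the 59400 observations. A short sketch that reproduces them is shown below; the dictionary names are illustrative. The report does not say whether these zeros are real measurements or placeholders for unknown values, so the code only recomputes the shares.

```python
# Zero counts copied from the "Zeros" entries above.
n_observations = 59400
zero_counts = {
    "amount_tsh": 41639,
    "gps_height": 20438,
    "longitude": 1812,
    "num_private": 58643,
    "population": 21381,
    "construction_year": 20709,
}

# Share of zeros per variable, as a percentage rounded to one decimal.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct}%")
```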
Reproduction
| Analysis started | 2021-04-14 13:51:05.320956 |
|---|---|
| Analysis finished | 2021-04-14 13:52:09.009990 |
| Duration | 1 minute and 3.69 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
id
| Distinct | 59400 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37115.13177 |
|---|---|
| Minimum | 0 |
| Maximum | 74247 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3730.9 |
| Q1 | 18519.75 |
| median | 37061.5 |
| Q3 | 55656.5 |
| 95-th percentile | 70564.05 |
| Maximum | 74247 |
| Range | 74247 |
| Interquartile range (IQR) | 37136.75 |
Descriptive statistics
| Standard deviation | 21453.12837 |
|---|---|
| Coefficient of variation (CV) | 0.5780156866 |
| Kurtosis | -1.201515029 |
| Mean | 37115.13177 |
| Median Absolute Deviation (MAD) | 18568.5 |
| Skewness | 0.00262253035 |
| Sum | 2204638827 |
| Variance | 460236716.9 |
| Monotonicity | Not monotonic |
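Several of the derived statistics above follow directly from the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are simple differences. A minimal sketch using the figures reported for this variable (the variable names are illustrative):

```python
# Figures copied from the tables for this variable.
mean = 37115.13177
std = 21453.12837
q1, q3 = 18519.75, 55656.5
minimum, maximum = 0, 74247

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.1f}")
print(f"IQR: {iqr}")
print(f"Range: {value_range}")
```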
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 19811 | 1 | < 0.1% |
| 38200 | 1 | < 0.1% |
| 34106 | 1 | < 0.1% |
| 36155 | 1 | < 0.1% |
| 46396 | 1 | < 0.1% |
| 48445 | 1 | < 0.1% |
| 42302 | 1 | < 0.1% |
| 70984 | 1 | < 0.1% |
| 73033 | 1 | < 0.1% |
| Other values (59390) | 59390 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 |
| Value | Count | Frequency (%) |
| 74247 | 1 | |
| 74246 | 1 | |
| 74243 | 1 | |
| 74242 | 1 | |
| 74240 | 1 |
amount_tsh
| Distinct | 98 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 317.6503847 |
|---|---|
| Minimum | 0 |
| Maximum | 350000 |
| Zeros | 41639 |
| Zeros (%) | 70.1% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 20 |
| 95-th percentile | 1200 |
| Maximum | 350000 |
| Range | 350000 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 2997.574558 |
|---|---|
| Coefficient of variation (CV) | 9.436709989 |
| Kurtosis | 4903.543102 |
| Mean | 317.6503847 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 57.80779995 |
| Sum | 18868432.85 |
| Variance | 8985453.232 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 41639 | 70.1% |
| 500 | 3102 | 5.2% |
| 50 | 2472 | 4.2% |
| 1000 | 1488 | 2.5% |
| 20 | 1463 | 2.5% |
| 200 | 1220 | 2.1% |
| 100 | 816 | 1.4% |
| 10 | 806 | 1.4% |
| 30 | 743 | 1.3% |
| 2000 | 704 | 1.2% |
| Other values (88) | 4947 | 8.3% |
| Value | Count | Frequency (%) |
| 0 | 41639 | 70.1% |
| 0.2 | 3 | < 0.1% |
| 0.25 | 1 | < 0.1% |
| 1 | 3 | < 0.1% |
| 2 | 13 | < 0.1% |
| Value | Count | Frequency (%) |
| 350000 | 1 | |
| 250000 | 1 | |
| 200000 | 1 | |
| 170000 | 1 | |
| 138000 | 1 |
date_recorded
Date
| Distinct | 356 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| Minimum | 2002-10-14 00:00:00 |
|---|---|
| Maximum | 2013-12-03 00:00:00 |
funder
| Distinct | 1897 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 3635 |
| Missing (%) | 6.1% |
| Memory size | 928.1 KiB |
| Government Of Tanzania | 9084 |
|---|---|
| Danida | 3114 |
| Hesawa | 2202 |
| Rwssp | 1374 |
| World Bank | 1349 |
| Other values (1892) |
Length
| Max length | 30 |
|---|---|
| Median length | 6 |
| Mean length | 9.929902268 |
| Min length | 1 |
Characters and Unicode
| Total characters | 553741 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
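The length and character figures are consistent with each other if the mean length is taken over non-missing values only. That reading is an assumption, but the arithmetic supports it, as this sketch with illustrative variable names shows:

```python
# Figures copied from the tables for this variable.
n_observations = 59400
missing = 3635
total_characters = 553741

# Assumption: the mean length averages over non-missing values only.
non_missing = n_observations - missing
mean_length = total_characters / non_missing

print(f"non-missing values: {non_missing}")
print(f"mean length: {mean_length:.9f}")
```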
Unique
| Unique | 974 |
|---|---|
| Unique (%) | 1.7% |
Sample
| 1st row | Roman |
|---|---|
| 2nd row | Grumeti |
| 3rd row | Lottery Club |
| 4th row | Unicef |
| 5th row | Action In A |
| Value | Count | Frequency (%) |
| Government Of Tanzania | 9084 | 15.3% |
| Danida | 3114 | 5.2% |
| Hesawa | 2202 | 3.7% |
| Rwssp | 1374 | 2.3% |
| World Bank | 1349 | 2.3% |
| Kkkt | 1287 | 2.2% |
| World Vision | 1246 | 2.1% |
| Unicef | 1057 | 1.8% |
| Tasaf | 877 | 1.5% |
| District Council | 843 | 1.4% |
| Other values (1887) | 33332 | |
| (Missing) | 3635 | 6.1% |
| Value | Count | Frequency (%) |
| of | 9748 | 10.8% |
| government | 9276 | 10.3% |
| tanzania | 9172 | 10.1% |
| danida | 3123 | 3.5% |
| world | 2789 | 3.1% |
| water | 2645 | 2.9% |
| hesawa | 2203 | 2.4% |
| bank | 1416 | 1.6% |
| rwssp | 1376 | 1.5% |
| kkkt | 1370 | 1.5% |
| Other values (2065) | 47254 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 68200 | 12.3% |
| n | 57842 | 10.4% |
| i | 38011 | 6.9% |
| e | 37464 | 6.8% |
| (space) | 34673 | 6.3% |
| r | 27879 | 5.0% |
| t | 23016 | 4.2% |
| o | 22741 | 4.1% |
| s | 17208 | 3.1% |
| d | 15464 | 2.8% |
| Other values (59) | 211243 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 425880 | |
| Uppercase Letter | 89705 | 16.2% |
| Space Separator | 34673 | 6.3% |
| Other Punctuation | 1322 | 0.2% |
| Decimal Number | 803 | 0.1% |
| Open Punctuation | 437 | 0.1% |
| Close Punctuation | 431 | 0.1% |
| Dash Punctuation | 323 | 0.1% |
| Connector Punctuation | 167 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| T | 12110 | |
| G | 10722 | |
| O | 10613 | |
| D | 7928 | 8.8% |
| W | 7352 | 8.2% |
| C | 4679 | 5.2% |
| R | 4454 | 5.0% |
| H | 3462 | 3.9% |
| M | 3135 | 3.5% |
| K | 2962 | 3.3% |
| Other values (16) | 22288 |
| Value | Count | Frequency (%) |
| a | 68200 | |
| n | 57842 | |
| i | 38011 | 8.9% |
| e | 37464 | 8.8% |
| r | 27879 | 6.5% |
| t | 23016 | 5.4% |
| o | 22741 | 5.3% |
| s | 17208 | 4.0% |
| d | 15464 | 3.6% |
| f | 15329 | 3.6% |
| Other values (16) | 102726 |
| Value | Count | Frequency (%) |
| / | 783 | |
| . | 469 | |
| \ | 33 | 2.5% |
| & | 26 | 2.0% |
| ' | 11 | 0.8% |
| Value | Count | Frequency (%) |
| 0 | 793 | |
| 2 | 5 | 0.6% |
| 1 | 2 | 0.2% |
| 9 | 2 | 0.2% |
| 4 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| ( | 434 | |
| [ | 3 | 0.7% |
| Value | Count | Frequency (%) |
| ) | 429 | |
| ] | 2 | 0.5% |
| Value | Count | Frequency (%) |
| 34673 |
| Value | Count | Frequency (%) |
| _ | 167 |
| Value | Count | Frequency (%) |
| - | 323 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 515585 | |
| Common | 38156 | 6.9% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 68200 | 13.2% |
| n | 57842 | 11.2% |
| i | 38011 | 7.4% |
| e | 37464 | 7.3% |
| r | 27879 | 5.4% |
| t | 23016 | 4.5% |
| o | 22741 | 4.4% |
| s | 17208 | 3.3% |
| d | 15464 | 3.0% |
| f | 15329 | 3.0% |
| Other values (42) | 192431 |
| Value | Count | Frequency (%) |
| 34673 | ||
| 0 | 793 | 2.1% |
| / | 783 | 2.1% |
| . | 469 | 1.2% |
| ( | 434 | 1.1% |
| ) | 429 | 1.1% |
| - | 323 | 0.8% |
| _ | 167 | 0.4% |
| \ | 33 | 0.1% |
| & | 26 | 0.1% |
| Other values (7) | 26 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 553741 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 68200 | 12.3% |
| n | 57842 | 10.4% |
| i | 38011 | 6.9% |
| e | 37464 | 6.8% |
| (space) | 34673 | 6.3% |
| r | 27879 | 5.0% |
| t | 23016 | 4.2% |
| o | 22741 | 4.1% |
| s | 17208 | 3.1% |
| d | 15464 | 2.8% |
| Other values (59) | 211243 |
gps_height
| Distinct | 2428 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 668.2972391 |
|---|---|
| Minimum | -90 |
| Maximum | 2770 |
| Zeros | 20438 |
| Zeros (%) | 34.4% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | -90 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 369 |
| Q3 | 1319.25 |
| 95-th percentile | 1797 |
| Maximum | 2770 |
| Range | 2860 |
| Interquartile range (IQR) | 1319.25 |
Descriptive statistics
| Standard deviation | 693.1163503 |
|---|---|
| Coefficient of variation (CV) | 1.037137833 |
| Kurtosis | -1.292440135 |
| Mean | 668.2972391 |
| Median Absolute Deviation (MAD) | 369 |
| Skewness | 0.462402085 |
| Sum | 39696856 |
| Variance | 480410.2751 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 20438 | 34.4% |
| -15 | 60 | 0.1% |
| -16 | 55 | 0.1% |
| -13 | 55 | 0.1% |
| 1290 | 52 | 0.1% |
| -20 | 52 | 0.1% |
| -14 | 51 | 0.1% |
| 303 | 51 | 0.1% |
| -18 | 49 | 0.1% |
| -19 | 47 | 0.1% |
| Other values (2418) | 38490 |
| Value | Count | Frequency (%) |
| -90 | 1 | |
| -63 | 2 | |
| -59 | 1 | |
| -57 | 1 | |
| -55 | 1 |
| Value | Count | Frequency (%) |
| 2770 | 1 | |
| 2628 | 1 | |
| 2627 | 1 | |
| 2626 | 2 | |
| 2623 | 1 |
installer
| Distinct | 2145 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 3655 |
| Missing (%) | 6.2% |
| Memory size | 928.1 KiB |
| DWE | 17402 |
|---|---|
| Government | 1825 |
| RWE | 1206 |
| Commu | 1060 |
| DANIDA | 1050 |
| Other values (2140) |
Length
| Max length | 30 |
|---|---|
| Median length | 4 |
| Mean length | 6.111202798 |
| Min length | 1 |
Characters and Unicode
| Total characters | 340669 |
|---|---|
| Distinct characters | 70 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1098 |
|---|---|
| Unique (%) | 2.0% |
Sample
| 1st row | Roman |
|---|---|
| 2nd row | GRUMETI |
| 3rd row | World vision |
| 4th row | UNICEF |
| 5th row | Artisan |
| Value | Count | Frequency (%) |
| DWE | 17402 | |
| Government | 1825 | 3.1% |
| RWE | 1206 | 2.0% |
| Commu | 1060 | 1.8% |
| DANIDA | 1050 | 1.8% |
| KKKT | 898 | 1.5% |
| Hesawa | 840 | 1.4% |
| 0 | 777 | 1.3% |
| TCRS | 707 | 1.2% |
| Central government | 622 | 1.0% |
| Other values (2135) | 29358 | |
| (Missing) | 3655 | 6.2% |
| Value | Count | Frequency (%) |
| dwe | 17601 | |
| government | 2778 | 4.1% |
| water | 1881 | 2.8% |
| hesawa | 1395 | 2.0% |
| rwe | 1230 | 1.8% |
| district | 1216 | 1.8% |
| kkkt | 1153 | 1.7% |
| council | 1106 | 1.6% |
| commu | 1065 | 1.6% |
| danida | 1051 | 1.5% |
| Other values (1976) | 37806 |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 27595 | 8.1% |
| W | 25849 | 7.6% |
| E | 25389 | 7.5% |
| a | 17343 | 5.1% |
| n | 16558 | 4.9% |
| e | 15500 | 4.5% |
| i | 15053 | 4.4% |
| A | 13668 | 4.0% |
| r | 13377 | 3.9% |
| t | 12904 | 3.8% |
| Other values (60) | 157433 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 167438 | |
| Lowercase Letter | 158190 | |
| Space Separator | 12673 | 3.7% |
| Other Punctuation | 971 | 0.3% |
| Decimal Number | 783 | 0.2% |
| Dash Punctuation | 268 | 0.1% |
| Connector Punctuation | 169 | < 0.1% |
| Open Punctuation | 159 | < 0.1% |
| Close Punctuation | 16 | < 0.1% |
| Currency Symbol | 2 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| D | 27595 | |
| W | 25849 | |
| E | 25389 | |
| A | 13668 | |
| C | 10535 | 6.3% |
| S | 6659 | 4.0% |
| R | 6518 | 3.9% |
| I | 6160 | 3.7% |
| T | 5948 | 3.6% |
| K | 5390 | 3.2% |
| Other values (16) | 33727 |
| Value | Count | Frequency (%) |
| a | 17343 | |
| n | 16558 | |
| e | 15500 | |
| i | 15053 | |
| r | 13377 | |
| t | 12904 | 8.2% |
| o | 12398 | 7.8% |
| m | 9289 | 5.9% |
| l | 6201 | 3.9% |
| s | 6173 | 3.9% |
| Other values (16) | 33394 |
| Value | Count | Frequency (%) |
| / | 670 | |
| . | 238 | 24.5% |
| & | 50 | 5.1% |
| ' | 12 | 1.2% |
| # | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 780 | |
| 1 | 1 | 0.1% |
| 4 | 1 | 0.1% |
| 9 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| } | 13 | |
| ] | 2 | 12.5% |
| ) | 1 | 6.2% |
| Value | Count | Frequency (%) |
| ( | 157 | |
| [ | 2 | 1.3% |
| Value | Count | Frequency (%) |
| 12673 |
| Value | Count | Frequency (%) |
| _ | 169 |
| Value | Count | Frequency (%) |
| - | 268 |
| Value | Count | Frequency (%) |
| $ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 325628 | |
| Common | 15041 | 4.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| D | 27595 | 8.5% |
| W | 25849 | 7.9% |
| E | 25389 | 7.8% |
| a | 17343 | 5.3% |
| n | 16558 | 5.1% |
| e | 15500 | 4.8% |
| i | 15053 | 4.6% |
| A | 13668 | 4.2% |
| r | 13377 | 4.1% |
| t | 12904 | 4.0% |
| Other values (42) | 142392 |
| Value | Count | Frequency (%) |
| 12673 | ||
| 0 | 780 | 5.2% |
| / | 670 | 4.5% |
| - | 268 | 1.8% |
| . | 238 | 1.6% |
| _ | 169 | 1.1% |
| ( | 157 | 1.0% |
| & | 50 | 0.3% |
| } | 13 | 0.1% |
| ' | 12 | 0.1% |
| Other values (8) | 11 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 340669 |
Most frequent character per block
| Value | Count | Frequency (%) |
| D | 27595 | 8.1% |
| W | 25849 | 7.6% |
| E | 25389 | 7.5% |
| a | 17343 | 5.1% |
| n | 16558 | 4.9% |
| e | 15500 | 4.5% |
| i | 15053 | 4.4% |
| A | 13668 | 4.0% |
| r | 13377 | 3.9% |
| t | 12904 | 3.8% |
| Other values (60) | 157433 |
longitude
| Distinct | 57516 |
|---|---|
| Distinct (%) | 96.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.07742669 |
|---|---|
| Minimum | 0 |
| Maximum | 40.34519307 |
| Zeros | 1812 |
| Zeros (%) | 3.1% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 30.04066001 |
| Q1 | 33.09034738 |
| median | 34.90874343 |
| Q3 | 37.17838657 |
| 95-th percentile | 39.13323954 |
| Maximum | 40.34519307 |
| Range | 40.34519307 |
| Interquartile range (IQR) | 4.08803919 |
Descriptive statistics
| Standard deviation | 6.567431846 |
|---|---|
| Coefficient of variation (CV) | 0.1927208854 |
| Kurtosis | 19.18703105 |
| Mean | 34.07742669 |
| Median Absolute Deviation (MAD) | 2.032511095 |
| Skewness | -4.191046455 |
| Sum | 2024199.146 |
| Variance | 43.13116105 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1812 | 3.1% |
| 32.9771906 | 2 | < 0.1% |
| 32.91986139 | 2 | < 0.1% |
| 37.54278497 | 2 | < 0.1% |
| 39.10530661 | 2 | < 0.1% |
| 32.98478963 | 2 | < 0.1% |
| 39.10375198 | 2 | < 0.1% |
| 37.54157917 | 2 | < 0.1% |
| 37.28135697 | 2 | < 0.1% |
| 37.32890522 | 2 | < 0.1% |
| Other values (57506) | 57570 |
| Value | Count | Frequency (%) |
| 0 | 1812 | 3.1% |
| 29.6071219 | 1 | < 0.1% |
| 29.60720109 | 1 | < 0.1% |
| 29.61032056 | 1 | < 0.1% |
| 29.61096482 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 40.34519307 | 1 | |
| 40.34430089 | 1 | |
| 40.32523996 | 1 | |
| 40.32522643 | 1 | |
| 40.32340181 | 1 |
latitude
Real number (ℝ)
| Distinct | 57517 |
|---|---|
| Distinct (%) | 96.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.70603266 |
|---|---|
| Minimum | -11.64944018 |
| Maximum | -2 × 10⁻⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | -11.64944018 |
|---|---|
| 5-th percentile | -10.58554992 |
| Q1 | -8.540621305 |
| median | -5.02159665 |
| Q3 | -3.32615564 |
| 95-th percentile | -1.408872227 |
| Maximum | -2 × 10⁻⁸ |
| Range | 11.64944016 |
| Interquartile range (IQR) | 5.214465665 |
Descriptive statistics
| Standard deviation | 2.946019081 |
|---|---|
| Coefficient of variation (CV) | -0.5162990219 |
| Kurtosis | -1.057616666 |
| Mean | -5.70603266 |
| Median Absolute Deviation (MAD) | 2.07002988 |
| Skewness | -0.1520365709 |
| Sum | -338938.34 |
| Variance | 8.679028427 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -2 × 10⁻⁸ | 1812 | 3.1% |
| -2.49454559 | 2 | < 0.1% |
| -6.98318263 | 2 | < 0.1% |
| -7.05692253 | 2 | < 0.1% |
| -7.05637235 | 2 | < 0.1% |
| -2.48708461 | 2 | < 0.1% |
| -6.98188419 | 2 | < 0.1% |
| -6.97826294 | 2 | < 0.1% |
| -7.06537264 | 2 | < 0.1% |
| -6.99129411 | 2 | < 0.1% |
| Other values (57507) | 57570 |
| Value | Count | Frequency (%) |
| -11.64944018 | 1 | |
| -11.64837759 | 1 | |
| -11.58629656 | 1 | |
| -11.56857679 | 1 | |
| -11.56680457 | 1 |
| Value | Count | Frequency (%) |
| -2 × 10⁻⁸ | 1812 | 3.1% |
| -0.99846435 | 1 | < 0.1% |
| -0.998916 | 1 | < 0.1% |
| -0.99901209 | 1 | < 0.1% |
| -0.99911702 | 1 | < 0.1% |
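The near-zero latitude value occurs 1812 times, the same count as the zero longitudes. This suggests a placeholder coordinate, although the report does not confirm it or show that the two fall on the same rows. A minimal sketch of how such pairs could be masked, assuming that reading, follows. The helper `clean_coordinates` and the threshold `PLACEHOLDER_TOLERANCE` are hypothetical names introduced for illustration.

```python
from typing import Optional, Tuple

# Hypothetical threshold: coordinates closer to zero than this are
# treated as placeholders rather than real positions.
PLACEHOLDER_TOLERANCE = 1e-6


def clean_coordinates(
    longitude: float, latitude: float
) -> Tuple[Optional[float], Optional[float]]:
    """Return (None, None) when both coordinates are effectively zero."""
    if (
        abs(longitude) < PLACEHOLDER_TOLERANCE
        and abs(latitude) < PLACEHOLDER_TOLERANCE
    ):
        return None, None
    return longitude, latitude


# A suspected placeholder pair and a pair built from the two medians.
print(clean_coordinates(0.0, -2e-08))
print(clean_coordinates(34.90874343, -5.02159665))
```

Requiring both coordinates to be near zero avoids discarding a point that merely lies close to one axis.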
wpt_name
| Distinct | 37400 |
|---|---|
| Distinct (%) | 63.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| none | 3563 |
|---|---|
| Shuleni | 1748 |
| Zahanati | 830 |
| Msikitini | 535 |
| Kanisani | 323 |
| Other values (37395) |
Length
| Max length | 30 |
|---|---|
| Median length | 10 |
| Mean length | 10.96210438 |
| Min length | 1 |
Characters and Unicode
| Total characters | 651149 |
|---|---|
| Distinct characters | 75 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 32928 |
|---|---|
| Unique (%) | 55.4% |
Sample
| 1st row | none |
|---|---|
| 2nd row | Zahanati |
| 3rd row | Kwa Mahundi |
| 4th row | Zahanati Ya Nanyumbu |
| 5th row | Shuleni |
| Value | Count | Frequency (%) |
| none | 3563 | 6.0% |
| Shuleni | 1748 | 2.9% |
| Zahanati | 830 | 1.4% |
| Msikitini | 535 | 0.9% |
| Kanisani | 323 | 0.5% |
| Bombani | 271 | 0.5% |
| Sokoni | 260 | 0.4% |
| Ofisini | 254 | 0.4% |
| School | 208 | 0.4% |
| Shule Ya Msingi | 199 | 0.3% |
| Other values (37390) | 51209 |
| Value | Count | Frequency (%) |
| kwa | 21384 | 19.6% |
| none | 3565 | 3.3% |
| mzee | 3385 | 3.1% |
| shuleni | 2123 | 1.9% |
| ya | 1499 | 1.4% |
| shule | 1389 | 1.3% |
| school | 1113 | 1.0% |
| primary | 1052 | 1.0% |
| zahanati | 983 | 0.9% |
| msingi | 870 | 0.8% |
| Other values (29461) | 71931 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 98806 | |
| i | 52404 | 8.0% |
| (space) | 49898 | 7.7% |
| n | 42148 | 6.5% |
| e | 40985 | 6.3% |
| w | 31669 | 4.9% |
| K | 31385 | 4.8% |
| o | 30247 | 4.6% |
| u | 24217 | 3.7% |
| M | 22040 | 3.4% |
| Other values (65) | 227350 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 493422 | |
| Uppercase Letter | 105185 | 16.2% |
| Space Separator | 49898 | 7.7% |
| Decimal Number | 1680 | 0.3% |
| Other Punctuation | 741 | 0.1% |
| Dash Punctuation | 104 | < 0.1% |
| Open Punctuation | 37 | < 0.1% |
| Close Punctuation | 37 | < 0.1% |
| Connector Punctuation | 24 | < 0.1% |
| Modifier Symbol | 21 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 98806 | |
| i | 52404 | |
| n | 42148 | 8.5% |
| e | 40985 | 8.3% |
| w | 31669 | 6.4% |
| o | 30247 | 6.1% |
| u | 24217 | 4.9% |
| l | 20954 | 4.2% |
| m | 17631 | 3.6% |
| h | 17215 | 3.5% |
| Other values (16) | 117146 |
| Value | Count | Frequency (%) |
| K | 31385 | |
| M | 22040 | |
| S | 10752 | 10.2% |
| N | 4880 | 4.6% |
| A | 3497 | 3.3% |
| B | 3425 | 3.3% |
| C | 2791 | 2.7% |
| P | 2564 | 2.4% |
| L | 2507 | 2.4% |
| J | 2385 | 2.3% |
| Other values (16) | 18959 |
| Value | Count | Frequency (%) |
| 1 | 507 | |
| 2 | 439 | |
| 3 | 152 | 9.0% |
| 4 | 120 | 7.1% |
| 7 | 106 | 6.3% |
| 5 | 86 | 5.1% |
| 6 | 80 | 4.8% |
| 8 | 75 | 4.5% |
| 9 | 70 | 4.2% |
| 0 | 45 | 2.7% |
| Value | Count | Frequency (%) |
| ' | 417 | |
| . | 175 | |
| / | 146 | 19.7% |
| & | 2 | 0.3% |
| \ | 1 | 0.1% |
| Value | Count | Frequency (%) |
| ( | 29 | |
| [ | 8 | 21.6% |
| Value | Count | Frequency (%) |
| ) | 29 | |
| ] | 8 | 21.6% |
| Value | Count | Frequency (%) |
| 49898 |
| Value | Count | Frequency (%) |
| - | 104 |
| Value | Count | Frequency (%) |
| _ | 24 |
| Value | Count | Frequency (%) |
| ` | 21 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 598607 | |
| Common | 52542 | 8.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 98806 | |
| i | 52404 | 8.8% |
| n | 42148 | 7.0% |
| e | 40985 | 6.8% |
| w | 31669 | 5.3% |
| K | 31385 | 5.2% |
| o | 30247 | 5.1% |
| u | 24217 | 4.0% |
| M | 22040 | 3.7% |
| l | 20954 | 3.5% |
| Other values (42) | 203752 |
| Value | Count | Frequency (%) |
| 49898 | ||
| 1 | 507 | 1.0% |
| 2 | 439 | 0.8% |
| ' | 417 | 0.8% |
| . | 175 | 0.3% |
| 3 | 152 | 0.3% |
| / | 146 | 0.3% |
| 4 | 120 | 0.2% |
| 7 | 106 | 0.2% |
| - | 104 | 0.2% |
| Other values (13) | 478 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 651149 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 98806 | |
| i | 52404 | 8.0% |
| (space) | 49898 | 7.7% |
| n | 42148 | 6.5% |
| e | 40985 | 6.3% |
| w | 31669 | 4.9% |
| K | 31385 | 4.8% |
| o | 30247 | 4.6% |
| u | 24217 | 3.7% |
| M | 22040 | 3.4% |
| Other values (65) | 227350 |
num_private
| Distinct | 65 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4741414141 |
|---|---|
| Minimum | 0 |
| Maximum | 1776 |
| Zeros | 58643 |
| Zeros (%) | 98.7% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 1776 |
| Range | 1776 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 12.23622981 |
|---|---|
| Coefficient of variation (CV) | 25.80713147 |
| Kurtosis | 11137.29521 |
| Mean | 0.4741414141 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 91.93374999 |
| Sum | 28164 |
| Variance | 149.72532 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 58643 | 98.7% |
| 6 | 81 | 0.1% |
| 1 | 73 | 0.1% |
| 8 | 46 | 0.1% |
| 5 | 46 | 0.1% |
| 32 | 40 | 0.1% |
| 45 | 36 | 0.1% |
| 15 | 35 | 0.1% |
| 39 | 30 | 0.1% |
| 93 | 28 | < 0.1% |
| Other values (55) | 342 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 58643 | 98.7% |
| 1 | 73 | 0.1% |
| 2 | 23 | < 0.1% |
| 3 | 27 | < 0.1% |
| 4 | 20 | < 0.1% |
| Value | Count | Frequency (%) |
| 1776 | 1 | |
| 1402 | 1 | |
| 755 | 1 | |
| 698 | 1 | |
| 672 | 1 |
basin
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| Lake Victoria | |
|---|---|
| Pangani | |
| Rufiji | |
| Internal | |
| Lake Tanganyika | |
| Other values (4) |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 10.8923569 |
| Min length | 6 |
Characters and Unicode
| Total characters | 647006 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Lake Nyasa |
|---|---|
| 2nd row | Lake Victoria |
| 3rd row | Pangani |
| 4th row | Ruvuma / Southern Coast |
| 5th row | Lake Victoria |
| Value | Count | Frequency (%) |
| Lake Victoria | 10248 | |
| Pangani | 8940 | |
| Rufiji | 7976 | |
| Internal | 7785 | |
| Lake Tanganyika | 6432 | |
| Wami / Ruvu | 5987 | |
| Lake Nyasa | 5085 | |
| Ruvuma / Southern Coast | 4493 | |
| Lake Rukwa | 2454 | 4.1% |
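Most of the frequency cells in the table above did not survive extraction, but they can be recovered from the counts, which sum to the 59400 observations. A short sketch (the dictionary names are illustrative):

```python
# Counts copied from the value table above.
basin_counts = {
    "Lake Victoria": 10248,
    "Pangani": 8940,
    "Rufiji": 7976,
    "Internal": 7785,
    "Lake Tanganyika": 6432,
    "Wami / Ruvu": 5987,
    "Lake Nyasa": 5085,
    "Ruvuma / Southern Coast": 4493,
    "Lake Rukwa": 2454,
}

total = sum(basin_counts.values())

# Frequency of each value as a percentage, rounded to one decimal.
basin_pct = {
    name: round(100 * count / total, 1)
    for name, count in basin_counts.items()
}

for name, pct in basin_pct.items():
    print(f"{name}: {pct}%")
```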
| Value | Count | Frequency (%) |
| lake | 24219 | |
| 10480 | ||
| victoria | 10248 | |
| pangani | 8940 | 8.2% |
| rufiji | 7976 | 7.3% |
| internal | 7785 | 7.1% |
| tanganyika | 6432 | 5.9% |
| wami | 5987 | 5.5% |
| ruvu | 5987 | 5.5% |
| nyasa | 5085 | 4.7% |
| Other values (4) | 15933 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 107025 | |
| i | 57807 | 8.9% |
| n | 50807 | 7.9% |
| (space) | 49672 | 7.7% |
| e | 36497 | 5.6% |
| u | 35883 | 5.5% |
| k | 33105 | 5.1% |
| t | 27019 | 4.2% |
| L | 24219 | 3.7% |
| r | 22526 | 3.5% |
| Other values (22) | 202446 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 488262 | |
| Uppercase Letter | 98592 | 15.2% |
| Space Separator | 49672 | 7.7% |
| Other Punctuation | 10480 | 1.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 107025 | |
| i | 57807 | |
| n | 50807 | |
| e | 36497 | 7.5% |
| u | 35883 | 7.3% |
| k | 33105 | 6.8% |
| t | 27019 | 5.5% |
| r | 22526 | 4.6% |
| o | 19234 | 3.9% |
| g | 15372 | 3.1% |
| Other values (10) | 82987 |
| Value | Count | Frequency (%) |
| L | 24219 | |
| R | 20910 | |
| V | 10248 | |
| P | 8940 | 9.1% |
| I | 7785 | 7.9% |
| T | 6432 | 6.5% |
| W | 5987 | 6.1% |
| N | 5085 | 5.2% |
| S | 4493 | 4.6% |
| C | 4493 | 4.6% |
| Value | Count | Frequency (%) |
| 49672 |
| Value | Count | Frequency (%) |
| / | 10480 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 586854 | |
| Common | 60152 | 9.3% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 107025 | |
| i | 57807 | 9.9% |
| n | 50807 | 8.7% |
| e | 36497 | 6.2% |
| u | 35883 | 6.1% |
| k | 33105 | 5.6% |
| t | 27019 | 4.6% |
| L | 24219 | 4.1% |
| r | 22526 | 3.8% |
| R | 20910 | 3.6% |
| Other values (20) | 171056 |
| Value | Count | Frequency (%) |
| 49672 | ||
| / | 10480 | 17.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 647006 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 107025 | |
| i | 57807 | 8.9% |
| n | 50807 | 7.9% |
| (space) | 49672 | 7.7% |
| e | 36497 | 5.6% |
| u | 35883 | 5.5% |
| k | 33105 | 5.1% |
| t | 27019 | 4.2% |
| L | 24219 | 3.7% |
| r | 22526 | 3.5% |
| Other values (22) | 202446 |
subvillage
| Distinct | 19287 |
|---|---|
| Distinct (%) | 32.7% |
| Missing | 371 |
| Missing (%) | 0.6% |
| Memory size | 928.1 KiB |
| Madukani | 508 |
|---|---|
| Shuleni | 506 |
| Majengo | 502 |
| Kati | 373 |
| Mtakuja | 262 |
| Other values (19282) |
Length
| Max length | 30 |
|---|---|
| Median length | 7 |
| Mean length | 7.897592709 |
| Min length | 1 |
Characters and Unicode
| Total characters | 466187 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9424 |
|---|---|
| Unique (%) | 16.0% |
Sample
| 1st row | Mnyusi B |
|---|---|
| 2nd row | Nyamara |
| 3rd row | Majengo |
| 4th row | Mahakamani |
| 5th row | Kyanyamisa |
| Value | Count | Frequency (%) |
| Madukani | 508 | 0.9% |
| Shuleni | 506 | 0.9% |
| Majengo | 502 | 0.8% |
| Kati | 373 | 0.6% |
| Mtakuja | 262 | 0.4% |
| Sokoni | 232 | 0.4% |
| M | 187 | 0.3% |
| Muungano | 172 | 0.3% |
| Mbuyuni | 164 | 0.3% |
| Mlimani | 152 | 0.3% |
| Other values (19277) | 55971 | |
| (Missing) | 371 | 0.6% |
| Value | Count | Frequency (%) |
| a | 2387 | 3.4% |
| b | 2043 | 2.9% |
| kati | 1902 | 2.7% |
| majengo | 610 | 0.9% |
| wa | 600 | 0.8% |
| shuleni | 593 | 0.8% |
| madukani | 569 | 0.8% |
| mtaa | 514 | 0.7% |
| juu | 403 | 0.6% |
| mjini | 378 | 0.5% |
| Other values (17024) | 60795 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 72003 | |
| i | 45666 | 9.8% |
| n | 33499 | 7.2% |
| u | 26424 | 5.7% |
| e | 25671 | 5.5% |
| o | 23556 | 5.1% |
| M | 20431 | 4.4% |
| g | 18951 | 4.1% |
| l | 16372 | 3.5% |
| m | 15053 | 3.2% |
| Other values (63) | 168561 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 381263 | |
| Uppercase Letter | 71291 | 15.3% |
| Space Separator | 11766 | 2.5% |
| Other Punctuation | 1184 | 0.3% |
| Decimal Number | 589 | 0.1% |
| Modifier Symbol | 45 | < 0.1% |
| Dash Punctuation | 36 | < 0.1% |
| Open Punctuation | 5 | < 0.1% |
| Close Punctuation | 5 | < 0.1% |
| Connector Punctuation | 3 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 72003 | |
| i | 45666 | |
| n | 33499 | 8.8% |
| u | 26424 | 6.9% |
| e | 25671 | 6.7% |
| o | 23556 | 6.2% |
| g | 18951 | 5.0% |
| l | 16372 | 4.3% |
| m | 15053 | 3.9% |
| b | 11843 | 3.1% |
| Other values (16) | 92225 |
| Value | Count | Frequency (%) |
| M | 20431 | |
| K | 12545 | |
| N | 6068 | 8.5% |
| B | 5112 | 7.2% |
| I | 4503 | 6.3% |
| S | 4039 | 5.7% |
| A | 3076 | 4.3% |
| C | 2533 | 3.6% |
| L | 2458 | 3.4% |
| U | 1704 | 2.4% |
| Other values (15) | 8822 |
| Value | Count | Frequency (%) |
| 1 | 242 | |
| 2 | 70 | 11.9% |
| 3 | 50 | 8.5% |
| 4 | 49 | 8.3% |
| 6 | 33 | 5.6% |
| 8 | 32 | 5.4% |
| 9 | 32 | 5.4% |
| 0 | 30 | 5.1% |
| 5 | 29 | 4.9% |
| 7 | 22 | 3.7% |
| Value | Count | Frequency (%) |
| ' | 1017 | |
| / | 136 | 11.5% |
| . | 29 | 2.4% |
| # | 2 | 0.2% |
| Value | Count | Frequency (%) |
| ( | 4 | |
| [ | 1 | 20.0% |
| Value | Count | Frequency (%) |
| ) | 4 | |
| ] | 1 | 20.0% |
| Value | Count | Frequency (%) |
| 11766 |
| Value | Count | Frequency (%) |
| ` | 45 |
| Value | Count | Frequency (%) |
| - | 36 |
| Value | Count | Frequency (%) |
| _ | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 452554 | |
| Common | 13633 | 2.9% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 72003 | |
| i | 45666 | 10.1% |
| n | 33499 | 7.4% |
| u | 26424 | 5.8% |
| e | 25671 | 5.7% |
| o | 23556 | 5.2% |
| M | 20431 | 4.5% |
| g | 18951 | 4.2% |
| l | 16372 | 3.6% |
| m | 15053 | 3.3% |
| Other values (41) | 154928 |
| Value | Count | Frequency (%) |
| 11766 | ||
| ' | 1017 | 7.5% |
| 1 | 242 | 1.8% |
| / | 136 | 1.0% |
| 2 | 70 | 0.5% |
| 3 | 50 | 0.4% |
| 4 | 49 | 0.4% |
| ` | 45 | 0.3% |
| - | 36 | 0.3% |
| 6 | 33 | 0.2% |
| Other values (12) | 189 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 466187 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 72003 | |
| i | 45666 | 9.8% |
| n | 33499 | 7.2% |
| u | 26424 | 5.7% |
| e | 25671 | 5.5% |
| o | 23556 | 5.1% |
| M | 20431 | 4.4% |
| g | 18951 | 4.1% |
| l | 16372 | 3.5% |
| m | 15053 | 3.2% |
| Other values (63) | 168561 |
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| Iringa | 5294 |
|---|---|
| Shinyanga | 4982 |
| Mbeya | 4639 |
| Kilimanjaro | 4379 |
| Morogoro | 4006 |
| Other values (16) | 36100 |
Length
| Max length | 13 |
|---|---|
| Median length | 6 |
| Mean length | 6.623754209 |
| Min length | 4 |
Characters and Unicode
| Total characters | 393451 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Iringa |
|---|---|
| 2nd row | Mara |
| 3rd row | Manyara |
| 4th row | Mtwara |
| 5th row | Kagera |
| Value | Count | Frequency (%) |
| Iringa | 5294 | 8.9% |
| Shinyanga | 4982 | 8.4% |
| Mbeya | 4639 | 7.8% |
| Kilimanjaro | 4379 | 7.4% |
| Morogoro | 4006 | 6.7% |
| Arusha | 3350 | 5.6% |
| Kagera | 3316 | 5.6% |
| Mwanza | 3102 | 5.2% |
| Kigoma | 2816 | 4.7% |
| Ruvuma | 2640 | 4.4% |
| Other values (11) | 20876 | 35.1% |
| Value | Count | Frequency (%) |
| iringa | 5294 | 8.7% |
| shinyanga | 4982 | 8.2% |
| mbeya | 4639 | 7.6% |
| kilimanjaro | 4379 | 7.2% |
| morogoro | 4006 | 6.6% |
| arusha | 3350 | 5.5% |
| kagera | 3316 | 5.4% |
| mwanza | 3102 | 5.1% |
| kigoma | 2816 | 4.6% |
| ruvuma | 2640 | 4.3% |
| Other values (13) | 22486 | 36.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 83413 | |
| n | 33143 | 8.4% |
| r | 32397 | 8.2% |
| i | 31763 | 8.1% |
| o | 29580 | 7.5% |
| g | 25054 | 6.4% |
| M | 17029 | 4.3% |
| m | 12841 | 3.3% |
| y | 11204 | 2.8% |
| K | 10511 | 2.7% |
| Other values (22) | 106516 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 331636 | |
| Uppercase Letter | 60205 | 15.3% |
| Space Separator | 1610 | 0.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 83413 | |
| n | 33143 | 10.0% |
| r | 32397 | 9.8% |
| i | 31763 | 9.6% |
| o | 29580 | 8.9% |
| g | 25054 | 7.6% |
| m | 12841 | 3.9% |
| y | 11204 | 3.4% |
| u | 10438 | 3.1% |
| w | 9275 | 2.8% |
| Other values (11) | 52528 |
| Value | Count | Frequency (%) |
| M | 17029 | |
| K | 10511 | |
| S | 7880 | |
| I | 5294 | 8.8% |
| T | 4506 | 7.5% |
| R | 4448 | 7.4% |
| A | 3350 | 5.6% |
| D | 3006 | 5.0% |
| P | 2635 | 4.4% |
| L | 1546 | 2.6% |
| Value | Count | Frequency (%) |
| 1610 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 391841 | |
| Common | 1610 | 0.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 83413 | |
| n | 33143 | 8.5% |
| r | 32397 | 8.3% |
| i | 31763 | 8.1% |
| o | 29580 | 7.5% |
| g | 25054 | 6.4% |
| M | 17029 | 4.3% |
| m | 12841 | 3.3% |
| y | 11204 | 2.9% |
| K | 10511 | 2.7% |
| Other values (21) | 104906 |
| Value | Count | Frequency (%) |
| 1610 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 393451 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 83413 | |
| n | 33143 | 8.4% |
| r | 32397 | 8.2% |
| i | 31763 | 8.1% |
| o | 29580 | 7.5% |
| g | 25054 | 6.4% |
| M | 17029 | 4.3% |
| m | 12841 | 3.3% |
| y | 11204 | 2.8% |
| K | 10511 | 2.7% |
| Other values (22) | 106516 |
region_code
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.29700337 |
|---|---|
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 12 |
| Q3 | 17 |
| 95-th percentile | 60 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 17.58740634 |
|---|---|
| Coefficient of variation (CV) | 1.149728866 |
| Kurtosis | 10.28843341 |
| Mean | 15.29700337 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 3.17381811 |
| Sum | 908642 |
| Variance | 309.3168617 |
| Monotonicity | Not monotonic |
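
The derived figures in the region_code tables above follow from the primary ones by simple arithmetic. As a rough check, the Python sketch below recomputes them from the reported values. It assumes the report takes the coefficient of variation as standard deviation divided by mean and the variance as the squared standard deviation; that assumption is consistent with the numbers shown, but the report itself does not state its formulas.

```python
# Values copied from the region_code tables above.
mean = 15.29700337
std = 17.58740634
q1, q3 = 5, 17
value_sum, n_rows = 908642, 59400

cv = std / mean                    # coefficient of variation (assumed definition)
iqr = q3 - q1                      # interquartile range
variance = std ** 2                # variance as squared standard deviation
mean_from_sum = value_sum / n_rows # mean recomputed from Sum and row count

print(f"CV={cv:.6f} IQR={iqr} variance={variance:.4f} mean={mean_from_sum:.6f}")
```

The same check can be repeated for the other numeric variables by swapping in their reported values.
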
| Value | Count | Frequency (%) |
| 11 | 5300 | 8.9% |
| 17 | 5011 | 8.4% |
| 12 | 4639 | 7.8% |
| 3 | 4379 | 7.4% |
| 5 | 4040 | 6.8% |
| 18 | 3324 | 5.6% |
| 19 | 3047 | 5.1% |
| 2 | 3024 | 5.1% |
| 16 | 2816 | 4.7% |
| 10 | 2640 | 4.4% |
| Other values (17) | 21180 | 35.7% |
| Value | Count | Frequency (%) |
| 1 | 2201 | 3.7% |
| 2 | 3024 | 5.1% |
| 3 | 4379 | 7.4% |
| 4 | 2513 | 4.2% |
| 5 | 4040 | 6.8% |
| Value | Count | Frequency (%) |
| 99 | 423 | 0.7% |
| 90 | 917 | 1.5% |
| 80 | 1238 | 2.1% |
| 60 | 1025 | 1.7% |
| 40 | 1 | < 0.1% |
district_code
Real number (ℝ≥0)
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.629747475 |
|---|---|
| Minimum | 0 |
| Maximum | 80 |
| Zeros | 23 |
| Zeros (%) | < 0.1% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 80 |
| Range | 80 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 9.633648629 |
|---|---|
| Coefficient of variation (CV) | 1.711204396 |
| Kurtosis | 16.21428363 |
| Mean | 5.629747475 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.962045299 |
| Sum | 334407 |
| Variance | 92.80718592 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 12203 | 20.5% |
| 2 | 11173 | 18.8% |
| 3 | 9998 | 16.8% |
| 4 | 8999 | 15.1% |
| 5 | 4356 | 7.3% |
| 6 | 4074 | 6.9% |
| 7 | 3343 | 5.6% |
| 8 | 1043 | 1.8% |
| 30 | 995 | 1.7% |
| 33 | 874 | 1.5% |
| Other values (10) | 2342 | 3.9% |
| Value | Count | Frequency (%) |
| 0 | 23 | < 0.1% |
| 1 | 12203 | 20.5% |
| 2 | 11173 | 18.8% |
| 3 | 9998 | 16.8% |
| 4 | 8999 | 15.1% |
| Value | Count | Frequency (%) |
| 80 | 12 | < 0.1% |
| 67 | 6 | < 0.1% |
| 63 | 195 | 0.3% |
| 62 | 109 | 0.2% |
| 60 | 63 | 0.1% |
lga
Categorical
| Distinct | 125 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| Njombe | 2503 |
|---|---|
| Arusha Rural | 1252 |
| Moshi Rural | 1251 |
| Bariadi | 1177 |
| Rungwe | 1106 |
| Other values (120) | 52111 |
Length
| Max length | 16 |
|---|---|
| Median length | 6 |
| Mean length | 7.416885522 |
| Min length | 3 |
Characters and Unicode
| Total characters | 440563 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Ludewa |
|---|---|
| 2nd row | Serengeti |
| 3rd row | Simanjiro |
| 4th row | Nanyumbu |
| 5th row | Karagwe |
| Value | Count | Frequency (%) |
| Njombe | 2503 | 4.2% |
| Arusha Rural | 1252 | 2.1% |
| Moshi Rural | 1251 | 2.1% |
| Bariadi | 1177 | 2.0% |
| Rungwe | 1106 | 1.9% |
| Kilosa | 1094 | 1.8% |
| Kasulu | 1047 | 1.8% |
| Mbozi | 1034 | 1.7% |
| Meru | 1009 | 1.7% |
| Bagamoyo | 997 | 1.7% |
| Other values (115) | 46930 | 79.0% |
| Value | Count | Frequency (%) |
| rural | 9552 | 13.5% |
| njombe | 2503 | 3.5% |
| urban | 1683 | 2.4% |
| moshi | 1330 | 1.9% |
| arusha | 1315 | 1.9% |
| bariadi | 1177 | 1.7% |
| singida | 1172 | 1.7% |
| rungwe | 1106 | 1.6% |
| kilosa | 1094 | 1.5% |
| kasulu | 1047 | 1.5% |
| Other values (106) | 48656 | 68.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 69982 | |
| o | 30079 | 6.8% |
| i | 29483 | 6.7% |
| u | 28324 | 6.4% |
| r | 26886 | 6.1% |
| e | 22579 | 5.1% |
| n | 22521 | 5.1% |
| l | 19238 | 4.4% |
| g | 18385 | 4.2% |
| M | 16017 | 3.6% |
| Other values (31) | 157069 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 358693 | |
| Uppercase Letter | 70635 | 16.0% |
| Space Separator | 11235 | 2.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 69982 | |
| o | 30079 | 8.4% |
| i | 29483 | 8.2% |
| u | 28324 | 7.9% |
| r | 26886 | 7.5% |
| e | 22579 | 6.3% |
| n | 22521 | 6.3% |
| l | 19238 | 5.4% |
| g | 18385 | 5.1% |
| m | 15622 | 4.4% |
| Other values (14) | 75594 |
| Value | Count | Frequency (%) |
| M | 16017 | |
| R | 12207 | |
| K | 11663 | |
| S | 6261 | 8.9% |
| N | 5760 | 8.2% |
| B | 4839 | 6.9% |
| U | 3410 | 4.8% |
| I | 2480 | 3.5% |
| L | 2131 | 3.0% |
| T | 1367 | 1.9% |
| Other values (6) | 4500 | 6.4% |
| Value | Count | Frequency (%) |
| 11235 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 429328 | |
| Common | 11235 | 2.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 69982 | |
| o | 30079 | 7.0% |
| i | 29483 | 6.9% |
| u | 28324 | 6.6% |
| r | 26886 | 6.3% |
| e | 22579 | 5.3% |
| n | 22521 | 5.2% |
| l | 19238 | 4.5% |
| g | 18385 | 4.3% |
| M | 16017 | 3.7% |
| Other values (30) | 145834 |
| Value | Count | Frequency (%) |
| 11235 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 440563 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 69982 | |
| o | 30079 | 6.8% |
| i | 29483 | 6.7% |
| u | 28324 | 6.4% |
| r | 26886 | 6.1% |
| e | 22579 | 5.1% |
| n | 22521 | 5.1% |
| l | 19238 | 4.4% |
| g | 18385 | 4.2% |
| M | 16017 | 3.6% |
| Other values (31) | 157069 |
ward
Categorical
| Distinct | 2092 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| Igosi | 307 |
|---|---|
| Imalinyi | 252 |
| Siha Kati | 232 |
| Mdandu | 231 |
| Nduruma | 217 |
| Other values (2087) | 58161 |
Length
| Max length | 23 |
|---|---|
| Median length | 7 |
| Mean length | 7.505841751 |
| Min length | 3 |
Characters and Unicode
| Total characters | 445847 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 30 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Mundindi |
|---|---|
| 2nd row | Natta |
| 3rd row | Ngorika |
| 4th row | Nanyumbu |
| 5th row | Nyakasimbi |
| Value | Count | Frequency (%) |
| Igosi | 307 | 0.5% |
| Imalinyi | 252 | 0.4% |
| Siha Kati | 232 | 0.4% |
| Mdandu | 231 | 0.4% |
| Nduruma | 217 | 0.4% |
| Kitunda | 203 | 0.3% |
| Mishamo | 203 | 0.3% |
| Msindo | 201 | 0.3% |
| Chalinze | 196 | 0.3% |
| Maji ya Chai | 190 | 0.3% |
| Other values (2082) | 57168 | 96.2% |
| Value | Count | Frequency (%) |
| mashariki | 580 | 0.9% |
| urban | 540 | 0.8% |
| siha | 434 | 0.7% |
| kusini | 393 | 0.6% |
| magharibi | 362 | 0.6% |
| igosi | 307 | 0.5% |
| masama | 303 | 0.5% |
| machame | 293 | 0.5% |
| kati | 270 | 0.4% |
| imalinyi | 252 | 0.4% |
| Other values (2106) | 61033 | 94.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 69533 | |
| i | 40243 | 9.0% |
| n | 29584 | 6.6% |
| u | 27015 | 6.1% |
| o | 26093 | 5.9% |
| e | 23589 | 5.3% |
| g | 21166 | 4.7% |
| M | 18916 | 4.2% |
| m | 16216 | 3.6% |
| l | 15799 | 3.5% |
| Other values (44) | 157693 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 374730 | |
| Uppercase Letter | 64523 | 14.5% |
| Space Separator | 5408 | 1.2% |
| Other Punctuation | 1163 | 0.3% |
| Dash Punctuation | 23 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| M | 18916 | |
| K | 11212 | |
| I | 6094 | 9.4% |
| N | 5919 | 9.2% |
| S | 3354 | 5.2% |
| L | 3162 | 4.9% |
| B | 3098 | 4.8% |
| U | 2913 | 4.5% |
| C | 2123 | 3.3% |
| R | 1692 | 2.6% |
| Other values (15) | 6040 | 9.4% |
| Value | Count | Frequency (%) |
| a | 69533 | |
| i | 40243 | |
| n | 29584 | 7.9% |
| u | 27015 | 7.2% |
| o | 26093 | 7.0% |
| e | 23589 | 6.3% |
| g | 21166 | 5.6% |
| m | 16216 | 4.3% |
| l | 15799 | 4.2% |
| r | 13057 | 3.5% |
| Other values (15) | 92435 |
| Value | Count | Frequency (%) |
| ' | 1013 | |
| / | 150 | 12.9% |
| Value | Count | Frequency (%) |
| 5408 |
| Value | Count | Frequency (%) |
| - | 23 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 439253 | |
| Common | 6594 | 1.5% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 69533 | |
| i | 40243 | 9.2% |
| n | 29584 | 6.7% |
| u | 27015 | 6.2% |
| o | 26093 | 5.9% |
| e | 23589 | 5.4% |
| g | 21166 | 4.8% |
| M | 18916 | 4.3% |
| m | 16216 | 3.7% |
| l | 15799 | 3.6% |
| Other values (40) | 151099 |
| Value | Count | Frequency (%) |
| 5408 | ||
| ' | 1013 | 15.4% |
| / | 150 | 2.3% |
| - | 23 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 445847 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 69533 | |
| i | 40243 | 9.0% |
| n | 29584 | 6.6% |
| u | 27015 | 6.1% |
| o | 26093 | 5.9% |
| e | 23589 | 5.3% |
| g | 21166 | 4.7% |
| M | 18916 | 4.2% |
| m | 16216 | 3.6% |
| l | 15799 | 3.5% |
| Other values (44) | 157693 |
| Distinct | 1049 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 179.9099832 |
|---|---|
| Minimum | 0 |
| Maximum | 30500 |
| Zeros | 21381 |
| Zeros (%) | 36.0% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 25 |
| Q3 | 215 |
| 95-th percentile | 680 |
| Maximum | 30500 |
| Range | 30500 |
| Interquartile range (IQR) | 215 |
Descriptive statistics
| Standard deviation | 471.4821757 |
|---|---|
| Coefficient of variation (CV) | 2.620655994 |
| Kurtosis | 402.2801153 |
| Mean | 179.9099832 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 12.66071359 |
| Sum | 10686653 |
| Variance | 222295.442 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 21381 | 36.0% |
| 1 | 7025 | 11.8% |
| 200 | 1940 | 3.3% |
| 150 | 1892 | 3.2% |
| 250 | 1681 | 2.8% |
| 300 | 1476 | 2.5% |
| 100 | 1146 | 1.9% |
| 50 | 1139 | 1.9% |
| 500 | 1009 | 1.7% |
| 350 | 986 | 1.7% |
| Other values (1039) | 19725 | 33.2% |
| Value | Count | Frequency (%) |
| 0 | 21381 | 36.0% |
| 1 | 7025 | 11.8% |
| 2 | 4 | < 0.1% |
| 3 | 4 | < 0.1% |
| 4 | 13 | < 0.1% |
| Value | Count | Frequency (%) |
| 30500 | 1 | < 0.1% |
| 15300 | 1 | < 0.1% |
| 11463 | 1 | < 0.1% |
| 10000 | 3 | < 0.1% |
| 9865 | 1 | < 0.1% |
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3334 |
| Missing (%) | 5.6% |
| Memory size | 928.1 KiB |
| True | 51011 |
|---|---|
| False | 5055 |
| (Missing) | 3334 |
| Value | Count | Frequency (%) |
| True | 51011 | 85.9% |
| False | 5055 | 8.5% |
| (Missing) | 3334 | 5.6% |
recorded_by
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| GeoData Consultants Ltd |
|---|
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 1366200 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | GeoData Consultants Ltd |
|---|---|
| 2nd row | GeoData Consultants Ltd |
| 3rd row | GeoData Consultants Ltd |
| 4th row | GeoData Consultants Ltd |
| 5th row | GeoData Consultants Ltd |
| Value | Count | Frequency (%) |
| GeoData Consultants Ltd | 59400 | 100.0% |
| Value | Count | Frequency (%) |
| geodata | 59400 | 33.3% |
| consultants | 59400 | 33.3% |
| ltd | 59400 | 33.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 237600 | 17.4% |
| a | 178200 | 13.0% |
| o | 118800 | 8.7% |
|   | 118800 | 8.7% |
| n | 118800 | 8.7% |
| s | 118800 | 8.7% |
| G | 59400 | 4.3% |
| e | 59400 | 4.3% |
| D | 59400 | 4.3% |
| C | 59400 | 4.3% |
| Other values (4) | 237600 | 17.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1009800 | 73.9% |
| Uppercase Letter | 237600 | 17.4% |
| Space Separator | 118800 | 8.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| t | 237600 | 23.5% |
| a | 178200 | 17.6% |
| o | 118800 | 11.8% |
| n | 118800 | 11.8% |
| s | 118800 | 11.8% |
| e | 59400 | 5.9% |
| u | 59400 | 5.9% |
| l | 59400 | 5.9% |
| d | 59400 | 5.9% |
| Value | Count | Frequency (%) |
| G | 59400 | 25.0% |
| D | 59400 | 25.0% |
| C | 59400 | 25.0% |
| L | 59400 | 25.0% |
| Value | Count | Frequency (%) |
|   | 118800 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1247400 | 91.3% |
| Common | 118800 | 8.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| t | 237600 | 19.0% |
| a | 178200 | 14.3% |
| o | 118800 | 9.5% |
| n | 118800 | 9.5% |
| s | 118800 | 9.5% |
| G | 59400 | 4.8% |
| e | 59400 | 4.8% |
| D | 59400 | 4.8% |
| C | 59400 | 4.8% |
| u | 59400 | 4.8% |
| Other values (3) | 178200 | 14.3% |
| Value | Count | Frequency (%) |
|   | 118800 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1366200 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| t | 237600 | 17.4% |
| a | 178200 | 13.0% |
| o | 118800 | 8.7% |
|   | 118800 | 8.7% |
| n | 118800 | 8.7% |
| s | 118800 | 8.7% |
| G | 59400 | 4.3% |
| e | 59400 | 4.3% |
| D | 59400 | 4.3% |
| C | 59400 | 4.3% |
| Other values (4) | 237600 | 17.4% |
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3877 |
| Missing (%) | 6.5% |
| Memory size | 928.1 KiB |
| VWC | 36793 |
|---|---|
| WUG | 5206 |
| Water authority | 3153 |
| WUA | 2883 |
| Water Board | 2748 |
| Other values (7) | 4740 |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.644723808 |
| Min length | 3 |
Characters and Unicode
| Total characters | 257889 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | VWC |
|---|---|
| 2nd row | Other |
| 3rd row | VWC |
| 4th row | VWC |
| 5th row | VWC |
| Value | Count | Frequency (%) |
| VWC | 36793 | 61.9% |
| WUG | 5206 | 8.8% |
| Water authority | 3153 | 5.3% |
| WUA | 2883 | 4.9% |
| Water Board | 2748 | 4.6% |
| Parastatal | 1680 | 2.8% |
| Private operator | 1063 | 1.8% |
| Company | 1061 | 1.8% |
| Other | 766 | 1.3% |
| SWC | 97 | 0.2% |
| Other values (2) | 73 | 0.1% |
| (Missing) | 3877 | 6.5% |
| Value | Count | Frequency (%) |
| vwc | 36793 | 58.9% |
| water | 5901 | 9.4% |
| wug | 5206 | 8.3% |
| authority | 3153 | 5.0% |
| wua | 2883 | 4.6% |
| board | 2748 | 4.4% |
| parastatal | 1680 | 2.7% |
| private | 1063 | 1.7% |
| operator | 1063 | 1.7% |
| company | 1061 | 1.7% |
| Other values (4) | 936 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| W | 50880 | |
| C | 37951 | |
| V | 36793 | |
| a | 21709 | |
| t | 18531 | 7.2% |
| r | 17509 | 6.8% |
| o | 9089 | 3.5% |
| e | 8794 | 3.4% |
| U | 8089 | 3.1% |
|   | 6964 | 2.7% |
| Other values (19) | 41580 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 148229 | |
| Lowercase Letter | 102696 | |
| Space Separator | 6964 | 2.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 21709 | |
| t | 18531 | |
| r | 17509 | |
| o | 9089 | |
| e | 8794 | |
| i | 4216 | 4.1% |
| y | 4214 | 4.1% |
| h | 3919 | 3.8% |
| u | 3225 | 3.1% |
| d | 2748 | 2.7% |
| Other values (6) | 8742 |
| Value | Count | Frequency (%) |
| W | 50880 | |
| C | 37951 | |
| V | 36793 | |
| U | 8089 | 5.5% |
| G | 5206 | 3.5% |
| A | 2883 | 1.9% |
| B | 2748 | 1.9% |
| P | 2743 | 1.9% |
| O | 766 | 0.5% |
| S | 97 | 0.1% |
| Other values (2) | 73 | < 0.1% |
| Value | Count | Frequency (%) |
| 6964 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 250925 | |
| Common | 6964 | 2.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| W | 50880 | |
| C | 37951 | |
| V | 36793 | |
| a | 21709 | |
| t | 18531 | 7.4% |
| r | 17509 | 7.0% |
| o | 9089 | 3.6% |
| e | 8794 | 3.5% |
| U | 8089 | 3.2% |
| G | 5206 | 2.1% |
| Other values (18) | 36374 |
| Value | Count | Frequency (%) |
| 6964 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 257889 |
Most frequent character per block
| Value | Count | Frequency (%) |
| W | 50880 | |
| C | 37951 | |
| V | 36793 | |
| a | 21709 | |
| t | 18531 | 7.2% |
| r | 17509 | 6.8% |
| o | 9089 | 3.5% |
| e | 8794 | 3.4% |
| U | 8089 | 3.1% |
|   | 6964 | 2.7% |
| Other values (19) | 41580 |
scheme_name
Categorical
| Distinct | 2696 |
|---|---|
| Distinct (%) | 8.6% |
| Missing | 28166 |
| Missing (%) | 47.4% |
| Memory size | 928.1 KiB |
| K | 682 |
|---|---|
| None | 644 |
| Borehole | 546 |
| Chalinze wate | 405 |
| M | 400 |
| Other values (2691) | 28557 |
Length
| Max length | 46 |
|---|---|
| Median length | 13 |
| Mean length | 14.30521227 |
| Min length | 1 |
Characters and Unicode
| Total characters | 446809 |
|---|---|
| Distinct characters | 68 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 712 |
|---|---|
| Unique (%) | 2.3% |
Sample
| 1st row | Roman |
|---|---|
| 2nd row | Nyumba ya mungu pipe scheme |
| 3rd row | Zingibali |
| 4th row | BL Bondeni |
| 5th row | None |
| Value | Count | Frequency (%) |
| K | 682 | 1.1% |
| None | 644 | 1.1% |
| Borehole | 546 | 0.9% |
| Chalinze wate | 405 | 0.7% |
| M | 400 | 0.7% |
| DANIDA | 379 | 0.6% |
| Government | 320 | 0.5% |
| Ngana water supplied scheme | 270 | 0.5% |
| wanging'ombe water supply s | 261 | 0.4% |
| wanging'ombe supply scheme | 234 | 0.4% |
| Other values (2686) | 27093 | 45.6% |
| (Missing) | 28166 | 47.4% |
| Value | Count | Frequency (%) |
| water | 9770 | 13.6% |
| supply | 6745 | 9.4% |
| scheme | 2532 | 3.5% |
| wa | 2157 | 3.0% |
| gravity | 1914 | 2.7% |
| pipe | 1346 | 1.9% |
| maji | 1343 | 1.9% |
| mradi | 1097 | 1.5% |
| line | 1016 | 1.4% |
| supplied | 877 | 1.2% |
| Other values (2506) | 43219 | 60.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 48584 | 10.9% |
|   | 41252 | 9.2% |
| e | 35239 | 7.9% |
| i | 26411 | 5.9% |
| p | 22451 | 5.0% |
| r | 21816 | 4.9% |
| t | 19216 | 4.3% |
| u | 18441 | 4.1% |
| n | 17760 | 4.0% |
| o | 17418 | 3.9% |
| Other values (58) | 178221 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 353183 | |
| Uppercase Letter | 50064 | 11.2% |
| Space Separator | 41252 | 9.2% |
| Other Punctuation | 1317 | 0.3% |
| Dash Punctuation | 554 | 0.1% |
| Open Punctuation | 191 | < 0.1% |
| Decimal Number | 147 | < 0.1% |
| Modifier Symbol | 70 | < 0.1% |
| Close Punctuation | 31 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 48584 | |
| e | 35239 | 10.0% |
| i | 26411 | 7.5% |
| p | 22451 | 6.4% |
| r | 21816 | 6.2% |
| t | 19216 | 5.4% |
| u | 18441 | 5.2% |
| n | 17760 | 5.0% |
| o | 17418 | 4.9% |
| l | 17308 | 4.9% |
| Other values (16) | 108539 |
| Value | Count | Frequency (%) |
| M | 9314 | |
| K | 5600 | |
| N | 4439 | 8.9% |
| S | 3770 | 7.5% |
| A | 2729 | 5.5% |
| I | 2691 | 5.4% |
| W | 2531 | 5.1% |
| B | 2387 | 4.8% |
| L | 2107 | 4.2% |
| U | 1790 | 3.6% |
| Other values (15) | 12706 |
| Value | Count | Frequency (%) |
| 2 | 61 | |
| 3 | 55 | |
| 7 | 7 | 4.8% |
| 1 | 7 | 4.8% |
| 4 | 7 | 4.8% |
| 5 | 4 | 2.7% |
| 0 | 3 | 2.0% |
| 6 | 3 | 2.0% |
| Value | Count | Frequency (%) |
| ' | 938 | |
| / | 370 | 28.1% |
| & | 8 | 0.6% |
| : | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 41252 |
| Value | Count | Frequency (%) |
| - | 554 |
| Value | Count | Frequency (%) |
| ( | 191 |
| Value | Count | Frequency (%) |
| ) | 31 |
| Value | Count | Frequency (%) |
| ` | 70 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 403247 | |
| Common | 43562 | 9.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 48584 | 12.0% |
| e | 35239 | 8.7% |
| i | 26411 | 6.5% |
| p | 22451 | 5.6% |
| r | 21816 | 5.4% |
| t | 19216 | 4.8% |
| u | 18441 | 4.6% |
| n | 17760 | 4.4% |
| o | 17418 | 4.3% |
| l | 17308 | 4.3% |
| Other values (41) | 158603 |
| Value | Count | Frequency (%) |
| 41252 | ||
| ' | 938 | 2.2% |
| - | 554 | 1.3% |
| / | 370 | 0.8% |
| ( | 191 | 0.4% |
| ` | 70 | 0.2% |
| 2 | 61 | 0.1% |
| 3 | 55 | 0.1% |
| ) | 31 | 0.1% |
| & | 8 | < 0.1% |
| Other values (7) | 32 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 446809 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 48584 | 10.9% |
|   | 41252 | 9.2% |
| e | 35239 | 7.9% |
| i | 26411 | 5.9% |
| p | 22451 | 5.0% |
| r | 21816 | 4.9% |
| t | 19216 | 4.3% |
| u | 18441 | 4.1% |
| n | 17760 | 4.0% |
| o | 17418 | 3.9% |
| Other values (58) | 178221 |
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3056 |
| Missing (%) | 5.1% |
| Memory size | 928.1 KiB |
| True | 38852 |
|---|---|
| False | 17492 |
| (Missing) | 3056 |
| Value | Count | Frequency (%) |
| True | 38852 | 65.4% |
| False | 17492 | 29.4% |
| (Missing) | 3056 | 5.1% |
| Distinct | 55 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1300.652475 |
|---|---|
| Minimum | 0 |
| Maximum | 2013 |
| Zeros | 20709 |
| Zeros (%) | 34.9% |
| Memory size | 928.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1986 |
| Q3 | 2004 |
| 95-th percentile | 2010 |
| Maximum | 2013 |
| Range | 2013 |
| Interquartile range (IQR) | 2004 |
Descriptive statistics
| Standard deviation | 951.6205473 |
|---|---|
| Coefficient of variation (CV) | 0.7316485885 |
| Kurtosis | -1.596432369 |
| Mean | 1300.652475 |
| Median Absolute Deviation (MAD) | 22 |
| Skewness | -0.6349277866 |
| Sum | 77258757 |
| Variance | 905581.6661 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 20709 | 34.9% |
| 2010 | 2645 | 4.5% |
| 2008 | 2613 | 4.4% |
| 2009 | 2533 | 4.3% |
| 2000 | 2091 | 3.5% |
| 2007 | 1587 | 2.7% |
| 2006 | 1471 | 2.5% |
| 2003 | 1286 | 2.2% |
| 2011 | 1256 | 2.1% |
| 2004 | 1123 | 1.9% |
| Other values (45) | 22086 | 37.2% |
| Value | Count | Frequency (%) |
| 0 | 20709 | 34.9% |
| 1960 | 102 | 0.2% |
| 1961 | 21 | < 0.1% |
| 1962 | 30 | 0.1% |
| 1963 | 85 | 0.1% |
| Value | Count | Frequency (%) |
| 2013 | 176 | 0.3% |
| 2012 | 1084 | 1.8% |
| 2011 | 1256 | 2.1% |
| 2010 | 2645 | 4.5% |
| 2009 | 2533 | 4.3% |
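
Zeros make up 34.9% of this variable and the next smallest value is 1960, so 0 may be standing in for an unknown year rather than a real one. That reading is an assumption; the report itself does not say. A minimal Python sketch with made-up values (the `years` list is hypothetical, not taken from the dataset) illustrates how excluding zeros shifts a summary statistic:

```python
from statistics import median

# Hypothetical sample: 0 is treated here as "year unknown" (an assumption).
years = [0, 0, 1986, 2004, 2010, 0, 1999]

# Keep only the entries that look like real years.
known = [y for y in years if y != 0]

median_all = median(years)    # zeros included, as in the tables above
median_known = median(known)  # zeros excluded

print(median_all, median_known)
```

If the zeros are indeed placeholders, statistics such as the mean, range, and IQR reported above describe the placeholder as much as the real years.
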
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| gravity | 26780 |
|---|---|
| nira/tanira | 8154 |
| other | 6430 |
| submersible | 4764 |
| swn 80 | 3670 |
| Other values (13) | 9602 |
Length
| Max length | 25 |
|---|---|
| Median length | 7 |
| Mean length | 7.719511785 |
| Min length | 3 |
Characters and Unicode
| Total characters | 458539 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | gravity |
| 3rd row | gravity |
| 4th row | submersible |
| 5th row | gravity |
| Value | Count | Frequency (%) |
| gravity | 26780 | 45.1% |
| nira/tanira | 8154 | 13.7% |
| other | 6430 | 10.8% |
| submersible | 4764 | 8.0% |
| swn 80 | 3670 | 6.2% |
| mono | 2865 | 4.8% |
| india mark ii | 2400 | 4.0% |
| afridev | 1770 | 3.0% |
| ksb | 1415 | 2.4% |
| other - rope pump | 451 | 0.8% |
| Other values (8) | 701 | 1.2% |
| Value | Count | Frequency (%) |
| gravity | 26780 | 38.1% |
| nira/tanira | 8154 | 11.6% |
| other | 7197 | 10.2% |
| submersible | 4764 | 6.8% |
| swn | 3899 | 5.5% |
| 80 | 3670 | 5.2% |
| mono | 2865 | 4.1% |
| india | 2498 | 3.6% |
| mark | 2498 | 3.6% |
| ii | 2400 | 3.4% |
| Other values (13) | 5640 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 60078 | |
| r | 59768 | |
| a | 58179 | |
| t | 42131 | |
| v | 28550 | 6.2% |
| y | 26867 | 5.9% |
| g | 26782 | 5.8% |
| n | 25691 | 5.6% |
| e | 19036 | 4.2% |
| s | 14844 | 3.2% |
| Other values (19) | 96613 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 430853 | |
| Space Separator | 10965 | 2.4% |
| Other Punctuation | 8156 | 1.8% |
| Decimal Number | 7798 | 1.7% |
| Dash Punctuation | 767 | 0.2% |
Most frequent character per category
| Value | Count | Frequency (%) |
| i | 60078 | |
| r | 59768 | |
| a | 58179 | |
| t | 42131 | |
| v | 28550 | |
| y | 26867 | 6.2% |
| g | 26782 | 6.2% |
| n | 25691 | 6.0% |
| e | 19036 | 4.4% |
| s | 14844 | 3.4% |
| Other values (13) | 68927 |
| Value | Count | Frequency (%) |
| 8 | 3899 | |
| 0 | 3670 | |
| 1 | 229 | 2.9% |
| Value | Count | Frequency (%) |
| 10965 |
| Value | Count | Frequency (%) |
| / | 8156 |
| Value | Count | Frequency (%) |
| - | 767 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 430853 | |
| Common | 27686 | 6.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| i | 60078 | |
| r | 59768 | |
| a | 58179 | |
| t | 42131 | |
| v | 28550 | |
| y | 26867 | 6.2% |
| g | 26782 | 6.2% |
| n | 25691 | 6.0% |
| e | 19036 | 4.4% |
| s | 14844 | 3.4% |
| Other values (13) | 68927 |
| Value | Count | Frequency (%) |
| 10965 | ||
| / | 8156 | |
| 8 | 3899 | 14.1% |
| 0 | 3670 | 13.3% |
| - | 767 | 2.8% |
| 1 | 229 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 458539 |
Most frequent character per block
| Value | Count | Frequency (%) |
| i | 60078 | |
| r | 59768 | |
| a | 58179 | |
| t | 42131 | |
| v | 28550 | 6.2% |
| y | 26867 | 5.9% |
| g | 26782 | 5.8% |
| n | 25691 | 5.6% |
| e | 19036 | 4.2% |
| s | 14844 | 3.2% |
| Other values (19) | 96613 |
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| gravity | 26780 |
|---|---|
| nira/tanira | 8154 |
| other | 6430 |
| submersible | 6179 |
| swn 80 | 3670 |
| Other values (8) | 8187 |
Length
| Max length | 15 |
|---|---|
| Median length | 7 |
| Mean length | 7.880538721 |
| Min length | 4 |
Characters and Unicode
| Total characters | 468104 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | gravity |
| 3rd row | gravity |
| 4th row | submersible |
| 5th row | gravity |
| Value | Count | Frequency (%) |
| gravity | 26780 | 45.1% |
| nira/tanira | 8154 | 13.7% |
| other | 6430 | 10.8% |
| submersible | 6179 | 10.4% |
| swn 80 | 3670 | 6.2% |
| mono | 2865 | 4.8% |
| india mark ii | 2400 | 4.0% |
| afridev | 1770 | 3.0% |
| rope pump | 451 | 0.8% |
| other handpump | 364 | 0.6% |
| Other values (3) | 337 | 0.6% |
| Value | Count | Frequency (%) |
| gravity | 26780 | 38.8% |
| nira/tanira | 8154 | 11.8% |
| other | 6916 | 10.0% |
| submersible | 6179 | 9.0% |
| swn | 3670 | 5.3% |
| 80 | 3670 | 5.3% |
| mono | 2865 | 4.2% |
| india | 2498 | 3.6% |
| mark | 2498 | 3.6% |
| ii | 2400 | 3.5% |
| Other values (7) | 3373 | 4.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 61244 | |
| r | 61141 | |
| a | 58372 | |
| t | 41972 | |
| v | 28550 | 6.1% |
| g | 26780 | 5.7% |
| y | 26780 | 5.7% |
| n | 25822 | 5.5% |
| e | 21729 | 4.6% |
| s | 16028 | 3.4% |
| Other values (16) | 99686 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 442890 | |
| Space Separator | 9603 | 2.1% |
| Other Punctuation | 8154 | 1.7% |
| Decimal Number | 7340 | 1.6% |
| Dash Punctuation | 117 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| i | 61244 | |
| r | 61141 | |
| a | 58372 | |
| t | 41972 | |
| v | 28550 | 6.4% |
| g | 26780 | 6.0% |
| y | 26780 | 6.0% |
| n | 25822 | 5.8% |
| e | 21729 | 4.9% |
| s | 16028 | 3.6% |
| Other values (11) | 74472 | 16.8% |
| Value | Count | Frequency (%) |
| 8 | 3670 | 50.0% |
| 0 | 3670 | 50.0% |
| Value | Count | Frequency (%) |
| (space) | 9603 | 100.0% |
| Value | Count | Frequency (%) |
| / | 8154 | 100.0% |
| Value | Count | Frequency (%) |
| - | 117 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 442890 | 94.6% |
| Common | 25214 | 5.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| i | 61244 | 13.8% |
| r | 61141 | 13.8% |
| a | 58372 | 13.2% |
| t | 41972 | 9.5% |
| v | 28550 | 6.4% |
| g | 26780 | 6.0% |
| y | 26780 | 6.0% |
| n | 25822 | 5.8% |
| e | 21729 | 4.9% |
| s | 16028 | 3.6% |
| Other values (11) | 74472 | 16.8% |
| Value | Count | Frequency (%) |
| (space) | 9603 | 38.1% |
| / | 8154 | 32.3% |
| 8 | 3670 | 14.6% |
| 0 | 3670 | 14.6% |
| - | 117 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 468104 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| i | 61244 | 13.1% |
| r | 61141 | 13.1% |
| a | 58372 | 12.5% |
| t | 41972 | 9.0% |
| v | 28550 | 6.1% |
| g | 26780 | 5.7% |
| y | 26780 | 5.7% |
| n | 25822 | 5.5% |
| e | 21729 | 4.6% |
| s | 16028 | 3.4% |
| Other values (16) | 99686 | 21.3% |
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| gravity | 26780 |
|---|---|
| handpump | 16456 |
| other | 6430 |
| submersible | 6179 |
| motorpump | 2987 |
| Other values (2) | 568 |
Length
| Max length | 12 |
|---|---|
| Median length | 7 |
| Mean length | 7.602239057 |
| Min length | 5 |
Characters and Unicode
| Total characters | 451573 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | gravity |
| 3rd row | gravity |
| 4th row | submersible |
| 5th row | gravity |
| Value | Count | Frequency (%) |
| gravity | 26780 | 45.1% |
| handpump | 16456 | 27.7% |
| other | 6430 | 10.8% |
| submersible | 6179 | 10.4% |
| motorpump | 2987 | 5.0% |
| rope pump | 451 | 0.8% |
| wind-powered | 117 | 0.2% |
| Value | Count | Frequency (%) |
| gravity | 26780 | 44.7% |
| handpump | 16456 | 27.5% |
| other | 6430 | 10.7% |
| submersible | 6179 | 10.3% |
| motorpump | 2987 | 5.0% |
| pump | 451 | 0.8% |
| rope | 451 | 0.8% |
| wind-powered | 117 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 43236 | 9.6% |
| r | 42944 | 9.5% |
| p | 40356 | 8.9% |
| t | 36197 | 8.0% |
| i | 33076 | 7.3% |
| m | 29060 | 6.4% |
| g | 26780 | 5.9% |
| v | 26780 | 5.9% |
| y | 26780 | 5.9% |
| u | 26073 | 5.8% |
| Other values (11) | 120291 | 26.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 451005 | 99.9% |
| Space Separator | 451 | 0.1% |
| Dash Punctuation | 117 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 43236 | 9.6% |
| r | 42944 | 9.5% |
| p | 40356 | 8.9% |
| t | 36197 | 8.0% |
| i | 33076 | 7.3% |
| m | 29060 | 6.4% |
| g | 26780 | 5.9% |
| v | 26780 | 5.9% |
| y | 26780 | 5.9% |
| u | 26073 | 5.8% |
| Other values (9) | 119723 | 26.5% |
| Value | Count | Frequency (%) |
| - | 117 | 100.0% |
| Value | Count | Frequency (%) |
| (space) | 451 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 451005 | 99.9% |
| Common | 568 | 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 43236 | 9.6% |
| r | 42944 | 9.5% |
| p | 40356 | 8.9% |
| t | 36197 | 8.0% |
| i | 33076 | 7.3% |
| m | 29060 | 6.4% |
| g | 26780 | 5.9% |
| v | 26780 | 5.9% |
| y | 26780 | 5.9% |
| u | 26073 | 5.8% |
| Other values (9) | 119723 | 26.5% |
| Value | Count | Frequency (%) |
| (space) | 451 | 79.4% |
| - | 117 | 20.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 451573 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 43236 | 9.6% |
| r | 42944 | 9.5% |
| p | 40356 | 8.9% |
| t | 36197 | 8.0% |
| i | 33076 | 7.3% |
| m | 29060 | 6.4% |
| g | 26780 | 5.9% |
| v | 26780 | 5.9% |
| y | 26780 | 5.9% |
| u | 26073 | 5.8% |
| Other values (11) | 120291 | 26.6% |
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| vwc | 40507 |
|---|---|
| wug | 6515 |
| water board | 2933 |
| wua | 2535 |
| private operator | 1971 |
| Other values (7) | 4939 |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.350639731 |
| Min length | 3 |
Characters and Unicode
| Total characters | 258428 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | vwc |
|---|---|
| 2nd row | wug |
| 3rd row | vwc |
| 4th row | vwc |
| 5th row | other |
| Value | Count | Frequency (%) |
| vwc | 40507 | 68.2% |
| wug | 6515 | 11.0% |
| water board | 2933 | 4.9% |
| wua | 2535 | 4.3% |
| private operator | 1971 | 3.3% |
| parastatal | 1768 | 3.0% |
| water authority | 904 | 1.5% |
| other | 844 | 1.4% |
| company | 685 | 1.2% |
| unknown | 561 | 0.9% |
| Other values (2) | 177 | 0.3% |
| Value | Count | Frequency (%) |
| vwc | 40507 | 61.9% |
| wug | 6515 | 10.0% |
| water | 3837 | 5.9% |
| board | 2933 | 4.5% |
| wua | 2535 | 3.9% |
| private | 1971 | 3.0% |
| operator | 1971 | 3.0% |
| parastatal | 1768 | 2.7% |
| other | 943 | 1.4% |
| authority | 904 | 1.4% |
| Other values (5) | 1522 | 2.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| w | 53955 | 20.9% |
| v | 42478 | 16.4% |
| c | 41291 | 16.0% |
| a | 21908 | 8.5% |
| r | 16376 | 6.3% |
| t | 14222 | 5.5% |
| u | 10593 | 4.1% |
| o | 10166 | 3.9% |
| e | 8722 | 3.4% |
| g | 6515 | 2.5% |
| Other values (13) | 32202 | 12.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 252323 | 97.6% |
| Space Separator | 6006 | 2.3% |
| Dash Punctuation | 99 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| w | 53955 | 21.4% |
| v | 42478 | 16.8% |
| c | 41291 | 16.4% |
| a | 21908 | 8.7% |
| r | 16376 | 6.5% |
| t | 14222 | 5.6% |
| u | 10593 | 4.2% |
| o | 10166 | 4.0% |
| e | 8722 | 3.5% |
| g | 6515 | 2.6% |
| Other values (11) | 26097 | 10.3% |
| Value | Count | Frequency (%) |
| (space) | 6006 | 100.0% |
| Value | Count | Frequency (%) |
| - | 99 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 252323 | 97.6% |
| Common | 6105 | 2.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| w | 53955 | 21.4% |
| v | 42478 | 16.8% |
| c | 41291 | 16.4% |
| a | 21908 | 8.7% |
| r | 16376 | 6.5% |
| t | 14222 | 5.6% |
| u | 10593 | 4.2% |
| o | 10166 | 4.0% |
| e | 8722 | 3.5% |
| g | 6515 | 2.6% |
| Other values (11) | 26097 | 10.3% |
| Value | Count | Frequency (%) |
| (space) | 6006 | 98.4% |
| - | 99 | 1.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 258428 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| w | 53955 | 20.9% |
| v | 42478 | 16.4% |
| c | 41291 | 16.0% |
| a | 21908 | 8.5% |
| r | 16376 | 6.3% |
| t | 14222 | 5.5% |
| u | 10593 | 4.1% |
| o | 10166 | 3.9% |
| e | 8722 | 3.4% |
| g | 6515 | 2.5% |
| Other values (13) | 32202 | 12.5% |
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| user-group | 52490 |
|---|---|
| commercial | 3638 |
| parastatal | 1768 |
| other | 943 |
| unknown | 561 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.892289562 |
| Min length | 5 |
Characters and Unicode
| Total characters | 587602 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | user-group |
|---|---|
| 2nd row | user-group |
| 3rd row | user-group |
| 4th row | user-group |
| 5th row | other |
| Value | Count | Frequency (%) |
| user-group | 52490 | 88.4% |
| commercial | 3638 | 6.1% |
| parastatal | 1768 | 3.0% |
| other | 943 | 1.6% |
| unknown | 561 | 0.9% |
| Value | Count | Frequency (%) |
| user-group | 52490 | 88.4% |
| commercial | 3638 | 6.1% |
| parastatal | 1768 | 3.0% |
| other | 943 | 1.6% |
| unknown | 561 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 111329 | 18.9% |
| u | 105541 | 18.0% |
| o | 57632 | 9.8% |
| e | 57071 | 9.7% |
| s | 54258 | 9.2% |
| p | 54258 | 9.2% |
| - | 52490 | 8.9% |
| g | 52490 | 8.9% |
| a | 10710 | 1.8% |
| c | 7276 | 1.2% |
| Other values (8) | 24547 | 4.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 535112 | 91.1% |
| Dash Punctuation | 52490 | 8.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| r | 111329 | 20.8% |
| u | 105541 | 19.7% |
| o | 57632 | 10.8% |
| e | 57071 | 10.7% |
| s | 54258 | 10.1% |
| p | 54258 | 10.1% |
| g | 52490 | 9.8% |
| a | 10710 | 2.0% |
| c | 7276 | 1.4% |
| m | 7276 | 1.4% |
| Other values (7) | 17271 | 3.2% |
| Value | Count | Frequency (%) |
| - | 52490 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 535112 | 91.1% |
| Common | 52490 | 8.9% |
Most frequent character per script
| Value | Count | Frequency (%) |
| r | 111329 | 20.8% |
| u | 105541 | 19.7% |
| o | 57632 | 10.8% |
| e | 57071 | 10.7% |
| s | 54258 | 10.1% |
| p | 54258 | 10.1% |
| g | 52490 | 9.8% |
| a | 10710 | 2.0% |
| c | 7276 | 1.4% |
| m | 7276 | 1.4% |
| Other values (7) | 17271 | 3.2% |
| Value | Count | Frequency (%) |
| - | 52490 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 587602 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| r | 111329 | 18.9% |
| u | 105541 | 18.0% |
| o | 57632 | 9.8% |
| e | 57071 | 9.7% |
| s | 54258 | 9.2% |
| p | 54258 | 9.2% |
| - | 52490 | 8.9% |
| g | 52490 | 8.9% |
| a | 10710 | 1.8% |
| c | 7276 | 1.2% |
| Other values (8) | 24547 | 4.2% |
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| never pay | 25348 |
|---|---|
| pay per bucket | 8985 |
| pay monthly | 8300 |
| unknown | 8157 |
| pay when scheme fails | 3914 |
| Other values (2) | 4696 |
Length
| Max length | 21 |
|---|---|
| Median length | 9 |
| Mean length | 10.66479798 |
| Min length | 5 |
Characters and Unicode
| Total characters | 633489 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | pay annually |
|---|---|
| 2nd row | never pay |
| 3rd row | pay per bucket |
| 4th row | never pay |
| 5th row | never pay |
| Value | Count | Frequency (%) |
| never pay | 25348 | 42.7% |
| pay per bucket | 8985 | 15.1% |
| pay monthly | 8300 | 14.0% |
| unknown | 8157 | 13.7% |
| pay when scheme fails | 3914 | 6.6% |
| pay annually | 3642 | 6.1% |
| other | 1054 | 1.8% |
| Value | Count | Frequency (%) |
| pay | 50189 | 39.7% |
| never | 25348 | 20.1% |
| bucket | 8985 | 7.1% |
| per | 8985 | 7.1% |
| monthly | 8300 | 6.6% |
| unknown | 8157 | 6.5% |
| when | 3914 | 3.1% |
| scheme | 3914 | 3.1% |
| fails | 3914 | 3.1% |
| annually | 3642 | 2.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 81462 | 12.9% |
| n | 69317 | 10.9% |
| (space) | 67002 | 10.6% |
| y | 62131 | 9.8% |
| a | 61387 | 9.7% |
| p | 59174 | 9.3% |
| r | 35387 | 5.6% |
| v | 25348 | 4.0% |
| u | 20784 | 3.3% |
| l | 19498 | 3.1% |
| Other values (11) | 131999 | 20.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 566487 | 89.4% |
| Space Separator | 67002 | 10.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 81462 | 14.4% |
| n | 69317 | 12.2% |
| y | 62131 | 11.0% |
| a | 61387 | 10.8% |
| p | 59174 | 10.4% |
| r | 35387 | 6.2% |
| v | 25348 | 4.5% |
| u | 20784 | 3.7% |
| l | 19498 | 3.4% |
| t | 18339 | 3.2% |
| Other values (10) | 113660 | 20.1% |
| Value | Count | Frequency (%) |
| (space) | 67002 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 566487 | 89.4% |
| Common | 67002 | 10.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 81462 | 14.4% |
| n | 69317 | 12.2% |
| y | 62131 | 11.0% |
| a | 61387 | 10.8% |
| p | 59174 | 10.4% |
| r | 35387 | 6.2% |
| v | 25348 | 4.5% |
| u | 20784 | 3.7% |
| l | 19498 | 3.4% |
| t | 18339 | 3.2% |
| Other values (10) | 113660 | 20.1% |
| Value | Count | Frequency (%) |
| (space) | 67002 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 633489 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 81462 | 12.9% |
| n | 69317 | 10.9% |
| (space) | 67002 | 10.6% |
| y | 62131 | 9.8% |
| a | 61387 | 9.7% |
| p | 59174 | 9.3% |
| r | 35387 | 5.6% |
| v | 25348 | 4.0% |
| u | 20784 | 3.3% |
| l | 19498 | 3.1% |
| Other values (11) | 131999 | 20.8% |
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| never pay | 25348 |
|---|---|
| per bucket | 8985 |
| monthly | 8300 |
| unknown | 8157 |
| on failure | 3914 |
| Other values (2) | 4696 |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.530757576 |
| Min length | 5 |
Characters and Unicode
| Total characters | 506727 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | annually |
|---|---|
| 2nd row | never pay |
| 3rd row | per bucket |
| 4th row | never pay |
| 5th row | never pay |
| Value | Count | Frequency (%) |
| never pay | 25348 | 42.7% |
| per bucket | 8985 | 15.1% |
| monthly | 8300 | 14.0% |
| unknown | 8157 | 13.7% |
| on failure | 3914 | 6.6% |
| annually | 3642 | 6.1% |
| other | 1054 | 1.8% |
| Value | Count | Frequency (%) |
| pay | 25348 | 26.0% |
| never | 25348 | 26.0% |
| bucket | 8985 | 9.2% |
| per | 8985 | 9.2% |
| monthly | 8300 | 8.5% |
| unknown | 8157 | 8.4% |
| on | 3914 | 4.0% |
| failure | 3914 | 4.0% |
| annually | 3642 | 3.7% |
| other | 1054 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 73634 | 14.5% |
| n | 69317 | 13.7% |
| r | 39301 | 7.8% |
| (space) | 38247 | 7.5% |
| y | 37290 | 7.4% |
| a | 36546 | 7.2% |
| p | 34333 | 6.8% |
| v | 25348 | 5.0% |
| u | 24698 | 4.9% |
| o | 21425 | 4.2% |
| Other values (10) | 106588 | 21.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 468480 | 92.5% |
| Space Separator | 38247 | 7.5% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 73634 | 15.7% |
| n | 69317 | 14.8% |
| r | 39301 | 8.4% |
| y | 37290 | 8.0% |
| a | 36546 | 7.8% |
| p | 34333 | 7.3% |
| v | 25348 | 5.4% |
| u | 24698 | 5.3% |
| o | 21425 | 4.6% |
| l | 19498 | 4.2% |
| Other values (9) | 87090 | 18.6% |
| Value | Count | Frequency (%) |
| (space) | 38247 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 468480 | 92.5% |
| Common | 38247 | 7.5% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 73634 | 15.7% |
| n | 69317 | 14.8% |
| r | 39301 | 8.4% |
| y | 37290 | 8.0% |
| a | 36546 | 7.8% |
| p | 34333 | 7.3% |
| v | 25348 | 5.4% |
| u | 24698 | 5.3% |
| o | 21425 | 4.6% |
| l | 19498 | 4.2% |
| Other values (9) | 87090 | 18.6% |
| Value | Count | Frequency (%) |
| (space) | 38247 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 506727 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 73634 | 14.5% |
| n | 69317 | 13.7% |
| r | 39301 | 7.8% |
| (space) | 38247 | 7.5% |
| y | 37290 | 7.4% |
| a | 36546 | 7.2% |
| p | 34333 | 6.8% |
| v | 25348 | 5.0% |
| u | 24698 | 4.9% |
| o | 21425 | 4.2% |
| Other values (10) | 106588 | 21.0% |
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| soft | 50818 |
|---|---|
| salty | 4856 |
| unknown | 1876 |
| milky | 804 |
| coloured | 490 |
| Other values (3) | 556 |
Length
| Max length | 18 |
|---|---|
| Median length | 4 |
| Mean length | 4.303282828 |
| Min length | 4 |
Characters and Unicode
| Total characters | 255615 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | soft |
|---|---|
| 2nd row | soft |
| 3rd row | soft |
| 4th row | soft |
| 5th row | soft |
| Value | Count | Frequency (%) |
| soft | 50818 | 85.6% |
| salty | 4856 | 8.2% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| coloured | 490 | 0.8% |
| salty abandoned | 339 | 0.6% |
| fluoride | 200 | 0.3% |
| fluoride abandoned | 17 | < 0.1% |
| Value | Count | Frequency (%) |
| soft | 50818 | 85.0% |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.1% |
| milky | 804 | 1.3% |
| coloured | 490 | 0.8% |
| abandoned | 356 | 0.6% |
| fluoride | 217 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 56013 | 21.9% |
| t | 56013 | 21.9% |
| o | 54247 | 21.2% |
| f | 51035 | 20.0% |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.3% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (9) | 8092 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 255259 | 99.9% |
| Space Separator | 356 | 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| s | 56013 | 21.9% |
| t | 56013 | 21.9% |
| o | 54247 | 21.3% |
| f | 51035 | 20.0% |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.4% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (8) | 7736 | 3.0% |
| Value | Count | Frequency (%) |
| (space) | 356 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 255259 | 99.9% |
| Common | 356 | 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| s | 56013 | 21.9% |
| t | 56013 | 21.9% |
| o | 54247 | 21.3% |
| f | 51035 | 20.0% |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.4% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (8) | 7736 | 3.0% |
| Value | Count | Frequency (%) |
| (space) | 356 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 255615 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| s | 56013 | 21.9% |
| t | 56013 | 21.9% |
| o | 54247 | 21.2% |
| f | 51035 | 20.0% |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.3% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (9) | 8092 | 3.2% |
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| good | 50818 |
|---|---|
| salty | 5195 |
| unknown | 1876 |
| milky | 804 |
| colored | 490 |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 4.23510101 |
| Min length | 4 |
Characters and Unicode
| Total characters | 251565 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | good |
|---|---|
| 2nd row | good |
| 3rd row | good |
| 4th row | good |
| 5th row | good |
| Value | Count | Frequency (%) |
| good | 50818 | 85.6% |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| colored | 490 | 0.8% |
| fluoride | 217 | 0.4% |
| Value | Count | Frequency (%) |
| good | 50818 | 85.6% |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| colored | 490 | 0.8% |
| fluoride | 217 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 104709 | 41.6% |
| d | 51525 | 20.5% |
| g | 50818 | 20.2% |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| s | 5195 | 2.1% |
| a | 5195 | 2.1% |
| t | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 251565 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| o | 104709 | 41.6% |
| d | 51525 | 20.5% |
| g | 50818 | 20.2% |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| s | 5195 | 2.1% |
| a | 5195 | 2.1% |
| t | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 251565 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| o | 104709 | 41.6% |
| d | 51525 | 20.5% |
| g | 50818 | 20.2% |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| s | 5195 | 2.1% |
| a | 5195 | 2.1% |
| t | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 251565 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| o | 104709 | 41.6% |
| d | 51525 | 20.5% |
| g | 50818 | 20.2% |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| s | 5195 | 2.1% |
| a | 5195 | 2.1% |
| t | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| enough | 33186 |
|---|---|
| insufficient | 15129 |
| dry | 6246 |
| seasonal | 4050 |
| unknown | 789 |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.362373737 |
| Min length | 3 |
Characters and Unicode
| Total characters | 437325 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | enough |
|---|---|
| 2nd row | insufficient |
| 3rd row | enough |
| 4th row | dry |
| 5th row | seasonal |
| Value | Count | Frequency (%) |
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
| Value | Count | Frequency (%) |
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 437325 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 437325 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 437325 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| enough | 33186 |
|---|---|
| insufficient | 15129 |
| dry | 6246 |
| seasonal | 4050 |
| unknown | 789 |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.362373737 |
| Min length | 3 |
Characters and Unicode
| Total characters | 437325 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | enough |
|---|---|
| 2nd row | insufficient |
| 3rd row | enough |
| 4th row | dry |
| 5th row | seasonal |
| Value | Count | Frequency (%) |
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
| Value | Count | Frequency (%) |
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 437325 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 437325 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 437325 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 69861 | 16.0% |
| e | 52365 | 12.0% |
| u | 49104 | 11.2% |
| i | 45387 | 10.4% |
| o | 38025 | 8.7% |
| g | 33186 | 7.6% |
| h | 33186 | 7.6% |
| f | 30258 | 6.9% |
| s | 23229 | 5.3% |
| c | 15129 | 3.5% |
| Other values (8) | 47595 | 10.9% |
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| spring | 17021 |
|---|---|
| shallow well | 16824 |
| machine dbh | 11075 |
| river | 9612 |
| rainwater harvesting | 2295 |
| Other values (5) | 2573 |
Length
| Max length | 20 |
|---|---|
| Median length | 11 |
| Mean length | 8.978804714 |
| Min length | 3 |
Characters and Unicode
| Total characters | 533341 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | spring |
|---|---|
| 2nd row | rainwater harvesting |
| 3rd row | dam |
| 4th row | machine dbh |
| 5th row | rainwater harvesting |
| Value | Count | Frequency (%) |
| spring | 17021 | 28.7% |
| shallow well | 16824 | 28.3% |
| machine dbh | 11075 | 18.6% |
| river | 9612 | 16.2% |
| rainwater harvesting | 2295 | 3.9% |
| hand dtw | 874 | 1.5% |
| lake | 765 | 1.3% |
| dam | 656 | 1.1% |
| other | 212 | 0.4% |
| unknown | 66 | 0.1% |
| Value | Count | Frequency (%) |
| spring | 17021 | 18.8% |
| shallow | 16824 | 18.6% |
| well | 16824 | 18.6% |
| dbh | 11075 | 12.2% |
| machine | 11075 | 12.2% |
| river | 9612 | 10.6% |
| rainwater | 2295 | 2.5% |
| harvesting | 2295 | 2.5% |
| hand | 874 | 1.0% |
| dtw | 874 | 1.0% |
| Other values (4) | 1699 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 68061 | 12.8% |
| r | 43342 | 8.1% |
| e | 43078 | 8.1% |
| h | 42355 | 7.9% |
| i | 42298 | 7.9% |
| a | 37079 | 7.0% |
| w | 36883 | 6.9% |
| s | 36140 | 6.8% |
| n | 33758 | 6.3% |
| (space) | 31068 | 5.8% |
| Other values (11) | 119279 | 22.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 502273 | 94.2% |
| Space Separator | 31068 | 5.8% |
Most frequent character per category
| Value | Count | Frequency (%) |
| l | 68061 | 13.6% |
| r | 43342 | 8.6% |
| e | 43078 | 8.6% |
| h | 42355 | 8.4% |
| i | 42298 | 8.4% |
| a | 37079 | 7.4% |
| w | 36883 | 7.3% |
| s | 36140 | 7.2% |
| n | 33758 | 6.7% |
| g | 19316 | 3.8% |
| Other values (10) | 99963 | 19.9% |
| Value | Count | Frequency (%) |
| (space) | 31068 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 502273 | 94.2% |
| Common | 31068 | 5.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| l | 68061 | 13.6% |
| r | 43342 | 8.6% |
| e | 43078 | 8.6% |
| h | 42355 | 8.4% |
| i | 42298 | 8.4% |
| a | 37079 | 7.4% |
| w | 36883 | 7.3% |
| s | 36140 | 7.2% |
| n | 33758 | 6.7% |
| g | 19316 | 3.8% |
| Other values (10) | 99963 | 19.9% |
| Value | Count | Frequency (%) |
| (space) | 31068 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 533341 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| l | 68061 | 12.8% |
| r | 43342 | 8.1% |
| e | 43078 | 8.1% |
| h | 42355 | 7.9% |
| i | 42298 | 7.9% |
| a | 37079 | 7.0% |
| w | 36883 | 6.9% |
| s | 36140 | 6.8% |
| n | 33758 | 6.3% |
| (space) | 31068 | 5.8% |
| Other values (11) | 119279 | 22.4% |
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| spring | 17021 |
|---|---|
| shallow well | 16824 |
| borehole | 11949 |
| river/lake | 10377 |
| rainwater harvesting | 2295 |
| Other values (2) | 934 |
Length
| Max length | 20 |
|---|---|
| Median length | 8 |
| Mean length | 9.303602694 |
| Min length | 3 |
Characters and Unicode
| Total characters | 552634 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | spring |
|---|---|
| 2nd row | rainwater harvesting |
| 3rd row | dam |
| 4th row | borehole |
| 5th row | rainwater harvesting |
| Value | Count | Frequency (%) |
| spring | 17021 | 28.7% |
| shallow well | 16824 | 28.3% |
| borehole | 11949 | 20.1% |
| river/lake | 10377 | 17.5% |
| rainwater harvesting | 2295 | 3.9% |
| dam | 656 | 1.1% |
| other | 278 | 0.5% |
| Value | Count | Frequency (%) |
| spring | 17021 | 21.7% |
| shallow | 16824 | 21.4% |
| well | 16824 | 21.4% |
| borehole | 11949 | 15.2% |
| river/lake | 10377 | 13.2% |
| rainwater | 2295 | 2.9% |
| harvesting | 2295 | 2.9% |
| dam | 656 | 0.8% |
| other | 278 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 89622 | 16.2% |
| e | 66344 | 12.0% |
| r | 56887 | 10.3% |
| o | 41000 | 7.4% |
| s | 36140 | 6.5% |
| w | 35943 | 6.5% |
| a | 34742 | 6.3% |
| i | 31988 | 5.8% |
| h | 31346 | 5.7% |
| n | 21611 | 3.9% |
| Other values (10) | 107011 | 19.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 523138 | 94.7% |
| Space Separator | 19119 | 3.5% |
| Other Punctuation | 10377 | 1.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| l | 89622 | 17.1% |
| e | 66344 | 12.7% |
| r | 56887 | 10.9% |
| o | 41000 | 7.8% |
| s | 36140 | 6.9% |
| w | 35943 | 6.9% |
| a | 34742 | 6.6% |
| i | 31988 | 6.1% |
| h | 31346 | 6.0% |
| n | 21611 | 4.1% |
| Other values (8) | 77515 | 14.8% |
| Value | Count | Frequency (%) |
| (space) | 19119 | 100.0% |
| Value | Count | Frequency (%) |
| / | 10377 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 523138 | 94.7% |
| Common | 29496 | 5.3% |
Most frequent character per script
| Value | Count | Frequency (%) |
| l | 89622 | 17.1% |
| e | 66344 | 12.7% |
| r | 56887 | 10.9% |
| o | 41000 | 7.8% |
| s | 36140 | 6.9% |
| w | 35943 | 6.9% |
| a | 34742 | 6.6% |
| i | 31988 | 6.1% |
| h | 31346 | 6.0% |
| n | 21611 | 4.1% |
| Other values (8) | 77515 | 14.8% |
| Value | Count | Frequency (%) |
| (space) | 19119 | 64.8% |
| / | 10377 | 35.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 552634 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| l | 89622 | 16.2% |
| e | 66344 | 12.0% |
| r | 56887 | 10.3% |
| o | 41000 | 7.4% |
| s | 36140 | 6.5% |
| w | 35943 | 6.5% |
| a | 34742 | 6.3% |
| i | 31988 | 5.8% |
| h | 31346 | 5.7% |
| n | 21611 | 3.9% |
| Other values (10) | 107011 | 19.4% |
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| groundwater | 45794 |
|---|---|
| surface | 13328 |
| unknown | 278 |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.08377104 |
| Min length | 7 |
Characters and Unicode
| Total characters | 598976 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | groundwater |
|---|---|
| 2nd row | surface |
| 3rd row | surface |
| 4th row | groundwater |
| 5th row | surface |
| Value | Count | Frequency (%) |
| groundwater | 45794 | 77.1% |
| surface | 13328 | 22.4% |
| unknown | 278 | 0.5% |
| Value | Count | Frequency (%) |
| groundwater | 45794 | 77.1% |
| surface | 13328 | 22.4% |
| unknown | 278 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 104916 | 17.5% |
| u | 59400 | 9.9% |
| a | 59122 | 9.9% |
| e | 59122 | 9.9% |
| n | 46628 | 7.8% |
| o | 46072 | 7.7% |
| w | 46072 | 7.7% |
| g | 45794 | 7.6% |
| d | 45794 | 7.6% |
| t | 45794 | 7.6% |
| Other values (4) | 40262 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 598976 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| r | 104916 | 17.5% |
| u | 59400 | 9.9% |
| a | 59122 | 9.9% |
| e | 59122 | 9.9% |
| n | 46628 | 7.8% |
| o | 46072 | 7.7% |
| w | 46072 | 7.7% |
| g | 45794 | 7.6% |
| d | 45794 | 7.6% |
| t | 45794 | 7.6% |
| Other values (4) | 40262 | 6.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 598976 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| r | 104916 | 17.5% |
| u | 59400 | 9.9% |
| a | 59122 | 9.9% |
| e | 59122 | 9.9% |
| n | 46628 | 7.8% |
| o | 46072 | 7.7% |
| w | 46072 | 7.7% |
| g | 45794 | 7.6% |
| d | 45794 | 7.6% |
| t | 45794 | 7.6% |
| Other values (4) | 40262 | 6.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 598976 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| r | 104916 | 17.5% |
| u | 59400 | 9.9% |
| a | 59122 | 9.9% |
| e | 59122 | 9.9% |
| n | 46628 | 7.8% |
| o | 46072 | 7.7% |
| w | 46072 | 7.7% |
| g | 45794 | 7.6% |
| d | 45794 | 7.6% |
| t | 45794 | 7.6% |
| Other values (4) | 40262 | 6.7% |
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| communal standpipe | 28522 |
|---|---|
| hand pump | 17488 |
| other | 6380 |
| communal standpipe multiple | 6103 |
| improved spring | 784 |
| Other values (2) | 123 |
Length
| Max length | 27 |
|---|---|
| Median length | 18 |
| Mean length | 14.82757576 |
| Min length | 3 |
Characters and Unicode
| Total characters | 880758 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | communal standpipe |
|---|---|
| 2nd row | communal standpipe |
| 3rd row | communal standpipe multiple |
| 4th row | communal standpipe multiple |
| 5th row | communal standpipe |
| Value | Count | Frequency (%) |
| communal standpipe | 28522 | 48.0% |
| hand pump | 17488 | 29.4% |
| other | 6380 | 10.7% |
| communal standpipe multiple | 6103 | 10.3% |
| improved spring | 784 | 1.3% |
| cattle trough | 116 | 0.2% |
| dam | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| standpipe | 34625 | 29.2% |
| communal | 34625 | 29.2% |
| hand | 17488 | 14.8% |
| pump | 17488 | 14.8% |
| other | 6380 | 5.4% |
| multiple | 6103 | 5.1% |
| improved | 784 | 0.7% |
| spring | 784 | 0.7% |
| trough | 116 | 0.1% |
| cattle | 116 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 111897 | 12.7% |
| m | 93632 | 10.6% |
| n | 87522 | 9.9% |
| a | 86861 | 9.9% |
| (space) | 59116 | 6.7% |
| u | 58332 | 6.6% |
| d | 52904 | 6.0% |
| e | 48008 | 5.5% |
| t | 47456 | 5.4% |
| l | 46947 | 5.3% |
| Other values (8) | 188083 | 21.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 821642 | 93.3% |
| Space Separator | 59116 | 6.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| p | 111897 | 13.6% |
| m | 93632 | 11.4% |
| n | 87522 | 10.7% |
| a | 86861 | 10.6% |
| u | 58332 | 7.1% |
| d | 52904 | 6.4% |
| e | 48008 | 5.8% |
| t | 47456 | 5.8% |
| l | 46947 | 5.7% |
| i | 42296 | 5.1% |
| Other values (7) | 145787 | 17.7% |
| Value | Count | Frequency (%) |
| (space) | 59116 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 821642 | 93.3% |
| Common | 59116 | 6.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| p | 111897 | 13.6% |
| m | 93632 | 11.4% |
| n | 87522 | 10.7% |
| a | 86861 | 10.6% |
| u | 58332 | 7.1% |
| d | 52904 | 6.4% |
| e | 48008 | 5.8% |
| t | 47456 | 5.8% |
| l | 46947 | 5.7% |
| i | 42296 | 5.1% |
| Other values (7) | 145787 | 17.7% |
| Value | Count | Frequency (%) |
| (space) | 59116 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 880758 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| p | 111897 | 12.7% |
| m | 93632 | 10.6% |
| n | 87522 | 9.9% |
| a | 86861 | 9.9% |
| (space) | 59116 | 6.7% |
| u | 58332 | 6.6% |
| d | 52904 | 6.0% |
| e | 48008 | 5.5% |
| t | 47456 | 5.4% |
| l | 46947 | 5.3% |
| Other values (8) | 188083 | 21.4% |
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| communal standpipe | 34625 |
|---|---|
| hand pump | 17488 |
| other | 6380 |
| improved spring | 784 |
| cattle trough | 116 |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 13.90287879 |
| Min length | 3 |
Characters and Unicode
| Total characters | 825831 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | communal standpipe |
|---|---|
| 2nd row | communal standpipe |
| 3rd row | communal standpipe |
| 4th row | communal standpipe |
| 5th row | communal standpipe |
| Value | Count | Frequency (%) |
| communal standpipe | 34625 | 58.3% |
| hand pump | 17488 | 29.4% |
| other | 6380 | 10.7% |
| improved spring | 784 | 1.3% |
| cattle trough | 116 | 0.2% |
| dam | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| standpipe | 34625 | 30.8% |
| communal | 34625 | 30.8% |
| hand | 17488 | 15.6% |
| pump | 17488 | 15.6% |
| other | 6380 | 5.7% |
| improved | 784 | 0.7% |
| spring | 784 | 0.7% |
| trough | 116 | 0.1% |
| cattle | 116 | 0.1% |
| dam | 7 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 105794 | 12.8% |
| m | 87529 | 10.6% |
| n | 87522 | 10.6% |
| a | 86861 | 10.5% |
| (space) | 53013 | 6.4% |
| d | 52904 | 6.4% |
| u | 52229 | 6.3% |
| o | 41905 | 5.1% |
| e | 41905 | 5.1% |
| t | 41353 | 5.0% |
| Other values (8) | 174816 | 21.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 772818 | 93.6% |
| Space Separator | 53013 | 6.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| p | 105794 | 13.7% |
| m | 87529 | 11.3% |
| n | 87522 | 11.3% |
| a | 86861 | 11.2% |
| d | 52904 | 6.8% |
| u | 52229 | 6.8% |
| o | 41905 | 5.4% |
| e | 41905 | 5.4% |
| t | 41353 | 5.4% |
| i | 36193 | 4.7% |
| Other values (7) | 138623 | 17.9% |
| Value | Count | Frequency (%) |
| (space) | 53013 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 772818 | 93.6% |
| Common | 53013 | 6.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| p | 105794 | 13.7% |
| m | 87529 | 11.3% |
| n | 87522 | 11.3% |
| a | 86861 | 11.2% |
| d | 52904 | 6.8% |
| u | 52229 | 6.8% |
| o | 41905 | 5.4% |
| e | 41905 | 5.4% |
| t | 41353 | 5.4% |
| i | 36193 | 4.7% |
| Other values (7) | 138623 | 17.9% |
| Value | Count | Frequency (%) |
| (space) | 53013 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 825831 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| p | 105794 | 12.8% |
| m | 87529 | 10.6% |
| n | 87522 | 10.6% |
| a | 86861 | 10.5% |
| (space) | 53013 | 6.4% |
| d | 52904 | 6.4% |
| u | 52229 | 6.3% |
| o | 41905 | 5.1% |
| e | 41905 | 5.1% |
| t | 41353 | 5.0% |
| Other values (8) | 174816 | 21.2% |
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.1 KiB |
| functional | 32259 |
|---|---|
| non functional | 22824 |
| functional needs repair | 4317 |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 12.48176768 |
| Min length | 10 |
Characters and Unicode
| Total characters | 741417 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | functional |
|---|---|
| 2nd row | functional |
| 3rd row | functional |
| 4th row | non functional |
| 5th row | functional |
| Value | Count | Frequency (%) |
| functional | 32259 | 54.3% |
| non functional | 22824 | 38.4% |
| functional needs repair | 4317 | 7.3% |
| Value | Count | Frequency (%) |
| functional | 59400 | 65.4% |
| non | 22824 | 25.1% |
| needs | 4317 | 4.8% |
| repair | 4317 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 168765 | 22.8% |
| o | 82224 | 11.1% |
| i | 63717 | 8.6% |
| a | 63717 | 8.6% |
| f | 59400 | 8.0% |
| u | 59400 | 8.0% |
| c | 59400 | 8.0% |
| t | 59400 | 8.0% |
| l | 59400 | 8.0% |
| (space) | 31458 | 4.2% |
| Other values (5) | 34536 | 4.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 709959 | 95.8% |
| Space Separator | 31458 | 4.2% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 168765 | 23.8% |
| o | 82224 | 11.6% |
| i | 63717 | 9.0% |
| a | 63717 | 9.0% |
| f | 59400 | 8.4% |
| u | 59400 | 8.4% |
| c | 59400 | 8.4% |
| t | 59400 | 8.4% |
| l | 59400 | 8.4% |
| e | 12951 | 1.8% |
| Other values (4) | 21585 | 3.0% |
| Value | Count | Frequency (%) |
| (space) | 31458 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 709959 | 95.8% |
| Common | 31458 | 4.2% |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 168765 | 23.8% |
| o | 82224 | 11.6% |
| i | 63717 | 9.0% |
| a | 63717 | 9.0% |
| f | 59400 | 8.4% |
| u | 59400 | 8.4% |
| c | 59400 | 8.4% |
| t | 59400 | 8.4% |
| l | 59400 | 8.4% |
| e | 12951 | 1.8% |
| Other values (4) | 21585 | 3.0% |
| Value | Count | Frequency (%) |
| (space) | 31458 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 741417 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 168765 | 22.8% |
| o | 82224 | 11.1% |
| i | 63717 | 8.6% |
| a | 63717 | 8.6% |
| f | 59400 | 8.0% |
| u | 59400 | 8.0% |
| c | 59400 | 8.0% |
| t | 59400 | 8.0% |
| l | 59400 | 8.0% |
| (space) | 31458 | 4.2% |
| Other values (5) | 34536 | 4.7% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the angle of the line to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
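As a minimal sketch of the calculation described above (not the report generator's own code), the following pure-Python function divides the covariance by the product of the standard deviations. The function name `pearson_r` and the sample data are illustrative assumptions. The second call shifts and rescales x by a positive factor to illustrate the invariance property.

```python
import math


def pearson_r(x, y):
    """Return Pearson's r: cov(x, y) / (std(x) * std(y))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and of the two standard
    # deviations cancel, so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Illustrative data (an assumption, not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r_original = pearson_r(x, y)
# Shift and rescale x with a positive scale factor: r should not change.
r_rescaled = pearson_r([10.0 * v - 3.0 for v in x], y)
print(r_original, r_rescaled)
```

Note that a negative scale factor would flip the sign of r while leaving its magnitude unchanged.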
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
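The rank-based calculation can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the report's implementation: the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, ties are given average ranks, and the cubic sample data are made up to show a monotonic but nonlinear relationship.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Illustrative data: y is a monotonic but nonlinear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this sample the ranks of x and y coincide, so ρ reaches its maximum while r stays below 1 because the relationship is not linear.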
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
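A minimal sketch of the pair-counting formula above, assuming the simplest variant (often called tau-a), in which tied pairs count as neither concordant nor discordant. The function name `kendall_tau_a` and the sample data are illustrative. Statistical libraries commonly default to tie-corrected variants such as tau-b, so their results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative data (an assumption, not taken from the dataset).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(tau)
```

In this sample, 8 of the 10 pairs are concordant and 2 are discordant, giving τ = (8 - 2) / 10.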
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 69572 | 6000.0 | 2011-03-14 | Roman | 1390 | Roman | 34.938093 | -9.856322 | none | 0 | Lake Nyasa | Mnyusi B | Iringa | 11 | 5 | Ludewa | Mundindi | 109 | True | GeoData Consultants Ltd | VWC | Roman | False | 1999 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional |
| 1 | 8776 | 0.0 | 2013-03-06 | Grumeti | 1399 | GRUMETI | 34.698766 | -2.147466 | Zahanati | 0 | Lake Victoria | Nyamara | Mara | 20 | 2 | Serengeti | Natta | 280 | NaN | GeoData Consultants Ltd | Other | NaN | True | 2010 | gravity | gravity | gravity | wug | user-group | never pay | never pay | soft | good | insufficient | insufficient | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional |
| 2 | 34310 | 25.0 | 2013-02-25 | Lottery Club | 686 | World vision | 37.460664 | -3.821329 | Kwa Mahundi | 0 | Pangani | Majengo | Manyara | 21 | 4 | Simanjiro | Ngorika | 250 | True | GeoData Consultants Ltd | VWC | Nyumba ya mungu pipe scheme | True | 2009 | gravity | gravity | gravity | vwc | user-group | pay per bucket | per bucket | soft | good | enough | enough | dam | dam | surface | communal standpipe multiple | communal standpipe | functional |
| 3 | 67743 | 0.0 | 2013-01-28 | Unicef | 263 | UNICEF | 38.486161 | -11.155298 | Zahanati Ya Nanyumbu | 0 | Ruvuma / Southern Coast | Mahakamani | Mtwara | 90 | 63 | Nanyumbu | Nanyumbu | 58 | True | GeoData Consultants Ltd | VWC | NaN | True | 1986 | submersible | submersible | submersible | vwc | user-group | never pay | never pay | soft | good | dry | dry | machine dbh | borehole | groundwater | communal standpipe multiple | communal standpipe | non functional |
| 4 | 19728 | 0.0 | 2011-07-13 | Action In A | 0 | Artisan | 31.130847 | -1.825359 | Shuleni | 0 | Lake Victoria | Kyanyamisa | Kagera | 18 | 1 | Karagwe | Nyakasimbi | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | gravity | gravity | gravity | other | other | never pay | never pay | soft | good | seasonal | seasonal | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional |
| 5 | 9944 | 20.0 | 2011-03-13 | Mkinga Distric Coun | 0 | DWE | 39.172796 | -4.765587 | Tajiri | 0 | Pangani | Moa/Mwereme | Tanga | 4 | 8 | Mkinga | Moa | 1 | True | GeoData Consultants Ltd | VWC | Zingibali | True | 2009 | submersible | submersible | submersible | vwc | user-group | pay per bucket | per bucket | salty | salty | enough | enough | other | other | unknown | communal standpipe multiple | communal standpipe | functional |
| 6 | 19816 | 0.0 | 2012-10-01 | Dwsp | 0 | DWSP | 33.362410 | -3.766365 | Kwa Ngomho | 0 | Internal | Ishinabulandi | Shinyanga | 17 | 3 | Shinyanga Rural | Samuye | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | non functional |
| 7 | 54551 | 0.0 | 2012-10-09 | Rwssp | 0 | DWE | 32.620617 | -4.226198 | Tushirikiane | 0 | Lake Tanganyika | Nyawishi Center | Shinyanga | 17 | 3 | Kahama | Chambo | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | wug | user-group | unknown | unknown | milky | milky | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | non functional |
| 8 | 53934 | 0.0 | 2012-11-03 | Wateraid | 0 | Water Aid | 32.711100 | -5.146712 | Kwa Ramadhan Musa | 0 | Lake Tanganyika | Imalauduki | Tabora | 14 | 6 | Tabora Urban | Itetemia | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | india mark ii | india mark ii | handpump | vwc | user-group | never pay | never pay | salty | salty | seasonal | seasonal | machine dbh | borehole | groundwater | hand pump | hand pump | non functional |
| 9 | 46144 | 0.0 | 2011-08-03 | Isingiro Ho | 0 | Artisan | 30.626991 | -1.257051 | Kwapeto | 0 | Lake Victoria | Mkonomre | Kagera | 18 | 1 | Karagwe | Kaisho | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional |
Last rows
| | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 59390 | 13677 | 0.0 | 2011-08-04 | Rudep | 1715 | DWE | 31.370848 | -8.258160 | Kwa Mzee Atanas | 0 | Lake Tanganyika | Kitonto | Rukwa | 15 | 2 | Sumbawanga Rural | Mkowe | 150 | True | GeoData Consultants Ltd | VWC | NaN | False | 1991 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | machine dbh | borehole | groundwater | hand pump | hand pump | functional |
| 59391 | 44885 | 0.0 | 2013-08-03 | Government Of Tanzania | 540 | Government | 38.044070 | -4.272218 | Kwa | 0 | Pangani | Maore Kati | Kilimanjaro | 3 | 3 | Same | Maore | 210 | True | GeoData Consultants Ltd | Water authority | Hingilili | True | 1967 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | non functional |
| 59392 | 40607 | 0.0 | 2011-04-15 | Government Of Tanzania | 0 | Government | 33.009440 | -8.520888 | Benard Charles | 0 | Lake Rukwa | Mbuyuni A | Mbeya | 12 | 1 | Chunya | Mbuyuni | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | non functional |
| 59393 | 48348 | 0.0 | 2012-10-27 | Private | 0 | Private | 33.866852 | -4.287410 | Kwa Peter | 0 | Internal | Masanga | Tabora | 14 | 2 | Igunga | Igunga | 0 | False | GeoData Consultants Ltd | Water authority | NaN | False | 0 | gravity | gravity | gravity | private operator | commercial | pay per bucket | per bucket | soft | good | insufficient | insufficient | dam | dam | surface | other | other | functional |
| 59394 | 11164 | 500.0 | 2011-03-09 | World Bank | 351 | ML appro | 37.634053 | -6.124830 | Chimeredya | 0 | Wami / Ruvu | Komstari | Morogoro | 5 | 6 | Mvomero | Diongoya | 89 | True | GeoData Consultants Ltd | VWC | NaN | True | 2007 | submersible | submersible | submersible | vwc | user-group | pay monthly | monthly | soft | good | enough | enough | machine dbh | borehole | groundwater | communal standpipe | communal standpipe | non functional |
| 59395 | 60739 | 10.0 | 2013-05-03 | Germany Republi | 1210 | CES | 37.169807 | -3.253847 | Area Three Namba 27 | 0 | Pangani | Kiduruni | Kilimanjaro | 3 | 5 | Hai | Masama Magharibi | 125 | True | GeoData Consultants Ltd | Water Board | Losaa Kia water supply | True | 1999 | gravity | gravity | gravity | water board | user-group | pay per bucket | per bucket | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional |
| 59396 | 27263 | 4700.0 | 2011-05-07 | Cefa-njombe | 1212 | Cefa | 35.249991 | -9.070629 | Kwa Yahona Kuvala | 0 | Rufiji | Igumbilo | Iringa | 11 | 4 | Njombe | Ikondo | 56 | True | GeoData Consultants Ltd | VWC | Ikondo electrical water sch | True | 1996 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | functional |
| 59397 | 37057 | 0.0 | 2011-04-11 | NaN | 0 | NaN | 34.017087 | -8.750434 | Mashine | 0 | Rufiji | Madungulu | Mbeya | 12 | 7 | Mbarali | Chimala | 0 | True | GeoData Consultants Ltd | VWC | NaN | False | 0 | swn 80 | swn 80 | handpump | vwc | user-group | pay monthly | monthly | fluoride | fluoride | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | functional |
| 59398 | 31282 | 0.0 | 2011-03-08 | Malec | 0 | Musa | 35.861315 | -6.378573 | Mshoro | 0 | Rufiji | Mwinyi | Dodoma | 1 | 4 | Chamwino | Mvumi Makulu | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | shallow well | shallow well | groundwater | hand pump | hand pump | functional |
| 59399 | 26348 | 0.0 | 2011-03-23 | World Bank | 191 | World | 38.104048 | -6.747464 | Kwa Mzee Lugawa | 0 | Wami / Ruvu | Kikatanyemba | Morogoro | 5 | 2 | Morogoro Rural | Ngerengere | 150 | True | GeoData Consultants Ltd | VWC | NaN | True | 2002 | nira/tanira | nira/tanira | handpump | vwc | user-group | pay when scheme fails | on failure | salty | salty | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional |